Druggable regions in the dengue virus envelope glycoprotein and methods of using the same

ABSTRACT

The present invention relates to novel druggable regions discovered in dengue virus envelope glycoprotein, or dengue virus E protein, which is a class II viral E protein. The present invention further relates to methods of using the druggable regions to screen potential candidate therapeutics, e.g. viral fusion inhibitors, for diseases caused by viruses having class II E proteins.

This patent application is a continuation of Application No. PCT/US04/012433, filed Apr. 22, 2004, which claims the benefit of priority to U.S. Provisional Patent Application Ser. No. 60/464,873, filed Apr. 22, 2003 and U.S. Provisional Patent Application Ser. No. 60/505,654, filed Sep. 24, 2003.

GOVERNMENT SUPPORT

The subject invention was made in part with government support under Grant Number CA 13202 awarded by the NIH and Grant Number LT00538 awarded by the Human Frontier Science Program. Accordingly, the U.S. Government has certain rights in this invention.

FIELD OF THE INVENTION

The present invention relates to novel druggable regions in the dengue virus envelope glycoprotein and methods of using the same, e.g. for drug discovery.

BACKGROUND OF THE INVENTION

Dengue virus, a member of the flavivirus family, imposes one of the largest social and economic burdens of any mosquito-borne viral pathogen. There is no specific treatment for infection, and control of dengue virus by vaccination has proved elusive. Several other flaviviruses are important human pathogens, including yellow fever, West Nile, tick-borne encephalitis (TBE) and Japanese encephalitis (JE) viruses.

Three structural proteins (“C”, “M”, and “E”) and a lipid bilayer package the positive-strand RNA genome of flaviviruses. The core nucleocapsid protein, C, assembles with RNA on the cytosolic face of the endoplasmic reticulum membrane. The assembling core buds through the ER membrane, thereby acquiring an envelope that contains the major envelope glycoprotein, E, and the so-called precursor membrane protein, PrM. The particle passes through the secretory pathway, where a furin-like protease cleaves PrM to M in a late trans-Golgi compartment. The cleavage, which removes most of the ectodomain of PrM, releases a constraint on E and primes the particle for low-pH triggered membrane fusion. Uncleaved, immature particles are not fusion competent.

Enveloped viruses enter cells by membrane fusion. E, which mediates both receptor binding and fusion, is a so-called “class II” viral fusion protein. Two classes of viral “fusion machines” have been identified so far. Class I viral fusion proteins include those of the myxo- and paramyxoviruses (e.g., influenza), the retroviruses (e.g., HIV), and the filoviruses (e.g., Ebola). Class II fusion proteins are found in not only the flaviviruses (yellow fever, West Nile, etc.), but also the alphaviruses (Semliki Forest virus, Sindbis virus, etc.), as well as Hepatitis C. The structural characteristics of the two classes are quite different, but both accomplish the same “reaction”, viz., fusion of two lipid bilayers.

The more familiar class I fusion proteins, exemplified by the haemagglutinin (HA) of influenza virus and gp120/gp41 of HIV, have a “fusion peptide” at or near the N-terminus of an internal cleavage point. This hydrophobic and glycine-rich segment, buried in the cleaved-primed trimer of the class I fusion protein, emerges when a large-scale conformational rearrangement is triggered by low pH (in the case of HA), receptor binding (in the case of gp120/gp41), or other cell-entry related signal. The likely sequence of events that follows includes an interaction of the fusion peptide with the target-cell membrane and a refolding of the trimer. The latter step brings together the fusion peptide and viral-membrane anchor, thereby drawing together the cellular and viral membranes and initiating the bilayer fusion process.

The class II proteins, found so far in flaviviruses and alphaviruses, have evolved a structurally different but mechanistically related fusion architecture. As in class I proteins, a proteolytic cleavage (of PrM to M in flaviviruses, or of pE2 to E2 in alphaviruses) yields mature virions, with the fusion proteins in a metastable conformation, primed for fusion. The fusion peptide, an internal loop at the tip of an elongated subdomain of the protein, is buried at a protein interface and becomes exposed in the conformational change initiated by exposure to low pH.

The mechanism of fusion of class II viral fusion proteins is not well-understood, and there are no therapeutics that can specifically inhibit the fusion of such proteins. Only the pre-fusion structures of one flaviviral and one alphaviral envelope protein have been determined to date. Because fusion is a key step in viral infectivity, a better understanding of the mechanism of class II envelope proteins and identification of druggable regions within such proteins will further development of therapeutics that can specifically inhibit viral infection by flaviviruses, alphaviruses, and hepatitis viruses.

SUMMARY OF THE INVENTION

Dengue virus E protein in both its pre- and post-fusion conformations has been crystallized and the structures solved as described in detail below, thereby providing information about the structure of the polypeptide, and druggable regions, domains and the like contained therein, all of which may be used in rational-based drug design efforts.

Accordingly, the present invention provides in part novel druggable regions in viral class II E proteins. The interaction of a drug with such regions, or the modulation of the activity of such regions with a drug, could inhibit viral fusion and hence viral infectivity. In one aspect, the present invention provides methods of screening compounds against these druggable regions in order to discover a candidate therapeutic for a disease caused by a virus having a class II protein, for example a small molecule viral fusion inhibitor. Diseases for which a therapeutic candidate may be screened include dengue fever, dengue hemorrhagic fever, tick-borne encephalitis, West Nile virus disease, yellow fever, Kyasanur Forest disease, louping ill, hepatitis C, Ross River virus disease, and O'nyong fever. In one embodiment, a method for identifying a candidate therapeutic for a disease caused by a virus having class II E protein, comprises contacting a class II E protein which comprises a druggable region with a compound, wherein binding of said compound indicates a candidate therapeutic. Compounds may in certain embodiments be selected from the following classes of compounds: polypeptides, peptidomimetics, and small molecules, and may be selected from a library of compounds. Such a library may be generated by combinatorial synthetic methods. Binding may be assayed either in vitro or in vivo. In certain embodiments of this method, the protein is dengue virus E protein and comprises at least one residue from a druggable region of dengue virus E protein. Such druggable regions also may be utilized in the structure determination, drug screening, drug design, and other methods described and claimed herein.

In one embodiment, the druggable region is comprised of the k1 hairpin or a portion thereof. In certain embodiments, the k1 hairpin may be comprised of at least one of residues 268-280 of a dengue virus E protein or the homologous residues in other class II E proteins. In other embodiments, the druggable region or active site region may be comprised of the k1 hairpin and at least one of residues 47-54, 128-137, and 187-207.

In yet another embodiment, the druggable region may comprise the regions involved in the binding of residues 396-429 (the “stem” region of dengue envelope protein E) to the trimeric, post-fusion form of dengue virus E protein or other flavivirus E protein. In one embodiment, the druggable region is comprised of the stem region or a portion thereof. The stem region comprises residues 396-447, or fragments thereof, for example 396-429 and 413-447. In another embodiment, the druggable region is comprised of the channel in which the stem region binds. The channel is comprised of the residues at the trimer interface formed by domain II of each subunit in the trimer. Domain II consists of residues 52-132 and 193-280. A second region is the channel where the stem binds, formed by residues in domain II.

In another embodiment, the druggable region is comprised of the domain I-III region. In certain embodiments, the domain I-III region may be comprised of at least one of residues 38-40; 143-147; 294-296; and 354-365 of a dengue virus E protein or the homologous residues in other class II E proteins. In other embodiments, the druggable region may be comprised of the domain I-domain III linker (residues 294-301).

In yet another embodiment, a druggable region is comprised of the fusion loop or a portion thereof.

Other regions of the protein may in certain embodiments comprise a druggable region. For example, the hydrophobic core beneath the k1 hairpin or a portion thereof may comprise a druggable region. In another example, a druggable region may comprise domain II or a portion thereof. In still another example, a druggable region may comprise domain III or a portion thereof. In other examples, the pH-dependent hinge may serve as a druggable region. Further, a region or portion of a region of the E protein involved in trimerization, such as, for example, the regions of domain II involved in trimerization, may present a druggable region. A region or a portion of a region involved in the stem fold-back conformational change may comprise a druggable region, for example, such regions as the stem-domain II contact regions, the trimeric N terminal inner core, and C terminal outer layer surfaces on the clustered domains II, as well as the 53-residue stem. In certain embodiments, a druggable region may consist of the entire fragment of the E protein spanning residues 1-395.
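As an illustration only, the residue ranges recited above can be collected into a simple lookup table. The sketch below is a hypothetical helper, not part of the described methods: the names `DRUGGABLE_REGIONS` and `regions_containing` are assumptions, and the numbering is that of dengue virus E protein as quoted in this Summary.

```python
# Residue ranges (inclusive) as recited in the Summary for dengue virus E protein.
# The dictionary layout and function name are illustrative assumptions.
DRUGGABLE_REGIONS = {
    "k1 hairpin": [(268, 280)],
    "stem": [(396, 447)],
    "domain II": [(52, 132), (193, 280)],
    "domain I-III region": [(38, 40), (143, 147), (294, 296), (354, 365)],
    "domain I-domain III linker": [(294, 301)],
}


def regions_containing(residue_number):
    """Return the sorted names of all listed regions that include the residue."""
    hits = []
    for name, ranges in DRUGGABLE_REGIONS.items():
        if any(start <= residue_number <= end for start, end in ranges):
            hits.append(name)
    return sorted(hits)
```

For example, `regions_containing(270)` reports both the k1 hairpin and domain II, because the recited ranges overlap.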

In another aspect, the present invention is directed towards methods for identifying a candidate therapeutic for a disease caused by a virus having class II E protein. In certain embodiments, such methods comprise contacting a class II E protein which comprises a druggable region with a compound, wherein the modulation of the activity of said E protein indicates a candidate therapeutic. In other embodiments, such methods comprise contacting a class II E protein which comprises a druggable region with a compound, wherein the preclusion of the movement or interaction of said druggable region indicates a candidate therapeutic. In still other embodiments, the modulation of the function or activity of said E protein involves precluding the completion of the post-fusion conformational change. In yet another embodiment, the modulation of the function or activity of said E protein involves interfering with the first stage of the conformational change. In another embodiment, a method for identifying a candidate therapeutic for a disease caused by a virus having class II E protein comprises contacting a class II E protein which comprises a druggable region with a compound, wherein the inhibition of fusion in said virus indicates a candidate therapeutic. In yet another embodiment, a method for identifying a candidate therapeutic for a disease caused by a virus having class II E protein comprises contacting a class II E protein which comprises a druggable region with a compound, wherein the inhibition of viral infectivity of said virus indicates a candidate therapeutic. In still another embodiment, a method for identifying a candidate therapeutic for a disease caused by a virus having class II E protein comprises contacting a class II E protein which comprises a druggable region with a compound, wherein the reduction of at least one symptom of said disease in a subject indicates a candidate therapeutic.

In another aspect, all of the information learned and described herein about class II E proteins may be used in methods of designing modulators of one or more of their biological activities. In one embodiment, a method for designing a modulator for the prevention or treatment of a disease caused by a virus having class II E protein, comprises: (a) providing a three-dimensional structure for a class II E protein; (b) identifying a potential modulator for the prevention or treatment of disease caused by a virus having class II E protein by reference to the three-dimensional structure; (c) contacting a class II E protein with the potential modulator; and (d) assaying the activity of the class II E protein or determining the viability of the virus having said class II E protein after contact with the modulator, wherein a change in the activity of the polypeptide or the viability of the virus indicates that the modulator may be useful for prevention or treatment of a virus-related disease or disorder. In certain embodiments, the potential modulator is identified by reference to the three-dimensional structure of a flavivirus E protein. In some embodiments, the flavivirus E protein is dengue virus E protein. In other embodiments, the potential modulator is identified by reference to the three-dimensional structure comprising a druggable region or fragment of a flavivirus E protein.

In yet another aspect, all of the information learned and described herein about class II E proteins may be used in methods of identifying new druggable regions in class II E proteins, or identifying the novel druggable regions of the invention in class II E proteins other than dengue virus E protein. In one embodiment, a method for identifying a druggable region of a class II E protein comprises: (a) obtaining crystals of a polypeptide comprising (1) an amino acid sequence comprising SEQ ID NO:2; or (2) an amino acid sequence having at least about 85% identity with the amino acid sequence comprising SEQ ID NO:2; and having at least one biological activity of a class II E protein, such that the three dimensional structure of the crystallized polypeptide may be determined to a resolution of 3.5 Å or better; (b) determining the three dimensional structure of the crystallized polypeptide using X-ray diffraction; and (c) identifying a druggable region of the crystallized polypeptide based on the three-dimensional structure of the crystallized polypeptide. In certain embodiments, the druggable region is a region that binds a detergent, and/or may comprise a region of the polypeptide that is exposed upon a conformational change. In yet another embodiment, a method for designing a candidate modulator for screening for modulators of a polypeptide, comprises: (a) providing the three dimensional structure of a druggable region of a polypeptide comprising (1) an amino acid sequence comprising SEQ ID NO:2; or (2) an amino acid sequence having at least about 85% identity with the amino acid sequence comprising SEQ ID NO:2; and having at least one biological activity of a class II E protein; and (b) designing a candidate modulator based on the three dimensional structure of the druggable region of the polypeptide.
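The "at least about 85% identity" criterion recited above can be illustrated with a short calculation. This is a minimal sketch under stated assumptions: the two sequences are already aligned to equal length with `-` marking gaps, and identity is taken as identical non-gap positions divided by the total number of aligned columns. Other conventions for the denominator exist, and the function names here are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity of two pre-aligned sequences of equal length.

    Assumed convention: matches are columns where both sequences carry the
    same residue (gaps never match); the denominator is the alignment length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a, aligned_b, threshold=85.0):
    """True if the aligned pair reaches the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

The short strings used to exercise such a function would be toy inputs, not SEQ ID NO:2 or any real E protein sequence.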

In yet another aspect, all of the information learned and described herein about class II E proteins may be used in methods of identifying modulators of the activity of a class II E protein. In one embodiment, a computer-assisted method for identifying a modulator of the activity of a class II E protein, comprises: (a) supplying a computer modeling application with a set of structure coordinates as listed in PDB accession numbers 1OKE or 1OAN or 1OK8 for the atoms of the amino acid residues from any of the above-described druggable regions of class II E protein so as to define part or all of a molecule or complex; (b) supplying the computer modeling application with a set of structure coordinates of a chemical entity; and (c) determining whether the chemical entity is expected to bind to or interfere with the molecule or complex, wherein determining whether the chemical entity is expected to bind to or interfere with the molecule or complex comprises performing a fitting operation between the chemical entity and a druggable region of the molecule or complex, followed by computationally analyzing the results of the fitting operation to quantify the association between the chemical entity and the druggable region. These methods may further comprise supplying or synthesizing the potential modulator, then assaying the potential modulator to determine whether it modulates class II E protein activity.
In another embodiment, a method for identifying a potential modulator for the prevention or treatment of a disease caused by a virus having class II E protein comprises: (a) providing the three dimensional structure of a crystallized polypeptide comprising: (1) an amino acid sequence comprising SEQ ID NO:2; or (2) an amino acid sequence having at least about 85% identity with the amino acid sequence comprising SEQ ID NO:2; or (3) an amino acid sequence comprising at least one druggable region of SEQ ID NO: 2; or (4) an amino acid sequence comprising a sequence having at least about 85% identity with at least one druggable region of SEQ ID NO: 2; and having at least one biological activity of a class II E protein; (b) obtaining a potential modulator for the prevention or treatment of said disease based on the three dimensional structure of the crystallized polypeptide; (c) contacting the potential modulator with a second polypeptide comprising an amino acid sequence that is at least 50% identical to the amino acid sequence comprising SEQ ID NO: 2 and having at least one biological activity of a class II E protein; which second polypeptide may optionally be the same as the crystallized polypeptide; and (d) assaying the activity of the second polypeptide, wherein a change in the activity of the second polypeptide indicates that the compound may be useful for prevention or treatment of a disease caused by a virus having class II E protein.
In yet another embodiment, a method for identifying a potential modulator of a polypeptide from a database comprises: (a) providing the three-dimensional coordinates for a plurality of the amino acids of a polypeptide comprising: (1) an amino acid sequence comprising SEQ ID NO:2; or (2) an amino acid sequence having at least about 85% identity with the amino acid sequence comprising SEQ ID NO:2; or (3) an amino acid sequence comprising at least one druggable region of SEQ ID NO: 2; or (4) an amino acid sequence comprising a sequence having at least about 85% identity with at least one druggable region of SEQ ID NO: 2; and having at least one biological activity of a class II E protein; (b) identifying a druggable region of the polypeptide; and (c) selecting from a database at least one potential modulator comprising three dimensional coordinates which indicate that the modulator may bind or interfere with the druggable region.
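The first steps of the computer-assisted methods above (supplying coordinates for the residues of a druggable region, then relating a chemical entity to them) can be sketched as follows. This is a toy illustration, not a modeling application: it reads fixed-column ATOM records in the standard PDB format, selects atoms by residue number, and counts ligand atoms near the region as a crude stand-in for a fitting score. The function names and the 4.0 Å cutoff are assumptions made for the example.

```python
import math


def parse_atoms(pdb_lines):
    """Yield (chain, residue_number, atom_name, (x, y, z)) per ATOM record.

    Uses the fixed column layout of the PDB format: atom name in columns
    13-16, chain in column 22, residue number in columns 23-26, and x, y, z
    in columns 31-38, 39-46 and 47-54.
    """
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        chain = line[21]
        residue_number = int(line[22:26])
        atom_name = line[12:16].strip()
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        yield chain, residue_number, atom_name, xyz


def select_region(pdb_lines, ranges):
    """Return the atoms whose residue number falls in any inclusive range."""
    return [
        atom for atom in parse_atoms(pdb_lines)
        if any(start <= atom[1] <= end for start, end in ranges)
    ]


def contact_count(region_atoms, ligand_xyz, cutoff=4.0):
    """Count ligand atoms within `cutoff` angstroms of any region atom.

    A deliberately crude score (assumed for illustration); real fitting
    operations use docking or energy-based scoring.
    """
    return sum(
        1 for point in ligand_xyz
        if any(math.dist(point, atom[3]) <= cutoff for atom in region_atoms)
    )
```

With a coordinate file read into a list of lines, `select_region(lines, [(268, 280)])` would return the atoms of the residues recited for the k1 hairpin.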

In still another aspect the present invention provides crystallized E proteins, fragments thereof, and E protein or protein fragment complexes, and methods of using the same, in methods for determining the structures of homologues of dengue virus E protein and its complexes (for example, the trimer of E proteins formed upon fusion with a membrane), or novel crystallized E proteins, fragments thereof, and E protein or protein fragment complexes. In one embodiment, a method for determining the crystal structure of a homolog of a polypeptide comprises: (a) providing the three dimensional structure of a first crystallized polypeptide comprising (1) an amino acid sequence comprising SEQ ID NO:2; or (2) an amino acid sequence having at least about 85% identity with the amino acid sequence comprising SEQ ID NO:2; or (3) an amino acid sequence comprising at least one druggable region of SEQ ID NO: 2; or (4) an amino acid sequence comprising a sequence having at least about 85% identity with at least one druggable region of SEQ ID NO: 2; and having at least one biological activity of a class II E protein; (b) obtaining crystals of a second polypeptide comprising an amino acid sequence that is at least 50% identical to the amino acid sequence comprising SEQ ID NO: 2 and having at least one biological activity of a class II E protein, such that the three dimensional structure of the second crystallized polypeptide may be determined to a resolution of 3.5 Å or better; and (c) determining the three dimensional structure of the second crystallized polypeptide by x-ray crystallography based on the atomic coordinates of the three dimensional structure provided in step (a).
In another embodiment, a method for obtaining structural information about a molecule or a molecular complex of unknown structure comprises: (a) crystallizing the molecule or molecular complex; (b) generating an x-ray diffraction pattern from the crystallized molecule or molecular complex; and (c) applying at least a portion of the structure coordinates of PDB accession numbers 1OKE or 1OAN to the x-ray diffraction pattern to generate a three-dimensional electron density map of at least a portion of the molecule or molecular complex whose structure is unknown. In still another embodiment, a method for making a crystallized complex comprising a polypeptide and a candidate modulator comprises: (a) crystallizing a polypeptide comprising (1) an amino acid sequence comprising SEQ ID NO:2; or (2) an amino acid sequence having at least about 85% identity with the amino acid sequence comprising SEQ ID NO:2; or (3) an amino acid sequence comprising at least one druggable region of SEQ ID NO: 2; or (4) an amino acid sequence comprising a sequence having at least about 85% identity with at least one druggable region of SEQ ID NO: 2; and having at least one biological activity of a class II E protein; such that crystals of the crystallized polypeptide will diffract x-rays to a resolution of 5 Å or better; and (b) soaking the crystals in a solution comprising a potential modulator.

Finally, the present invention provides modulators (in certain embodiments, inhibitors) of class II E protein activity, as well as pharmaceutical compositions and kits comprising the same. Such modulators may in certain embodiments interact with a druggable region of the invention. In still another aspect, the present invention is directed toward a modulator that is a fragment of (or homolog of such fragment or mimetic of such fragment) the druggable region of a dengue virus E protein or other viral class II E protein and competes with that druggable region. Modulators of any of the above-described druggable regions may be used alone or in complementary approaches to treat dengue viral or other viral infections.

In certain embodiments, a modulator interacts with the k1 hairpin so as to preclude it from moving, thereby modulating the activity of the dengue virus E protein or other flavivirus E protein. In another aspect, the present invention is directed towards a modulator that interacts with the stem region or the channel so as to preclude them from interacting, thereby modulating the activity of the dengue virus E protein or other flavivirus E protein. Such modulators may be, as described above, derived from either the stem region or the channel, and compete with the stem region or channel for binding. In still other embodiments, a modulator of class II E protein activity interacts with the domain I-III region. The modulator may also preclude the movement of the domain I-III region. In another aspect, the present invention is directed towards a modulator that interacts with the fusion loop so as to preclude it from moving, thereby modulating the activity of the dengue virus E protein or other E protein.

Further, the present invention is in part directed toward an inhibitor that comprises SEQ ID NO: 3 or SEQ ID NO: 4, as well as fragments, homologs, variants, orthologs, and peptidomimetics thereof. Further, the present invention is directed towards an inhibitor that interacts with the relevant surfaces on the clustered domains II, so that completion of the conformational change is inhibited, thereby inhibiting the activity of the dengue virus E protein or other E protein. The present invention is also directed towards an inhibitor that interacts with the pocket beneath the k1 hairpin to interfere with the first stage of the conformational change, thereby modulating the activity of the dengue virus E protein or other E protein. Such inhibitors may be used in complementary approaches to treat dengue viral or other viral infections.

The embodiments and practices of the present invention, other embodiments, and their features and characteristics, will be apparent from the description, figures and claims that follow, with all of the claims hereby being incorporated by this reference into this Summary.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1 depicts various views of dengue E protein and its ligand-binding pocket. FIG. 1A depicts the domain definition of dengue E. Domain I is red, domain II is yellow, and domain III is blue. FIG. 1B depicts the dengue E protein dimer, colored as in FIG. 1A, in complex with n-octyl-β-D-glucoside (β-OG). The β-OG, shown in green, is bound in a hydrophobic pocket under the k1 hairpin. The glycans in domains I and II are shown in ball-and-stick representation in red and yellow, respectively. Disulfide bridges are shown in orange. FIG. 1C depicts an enlarged view of the k1 hairpin region, with the structure of dengue E in the absence of β-OG (in translucent rendering) superimposed. The β-OG molecule, shown in space-filling representation, occupies the ligand-binding pocket. FIG. 1D depicts a superposition of the structures of dengue E and TBE E, both in the absence of β-OG. Dengue E is colored as in FIG. 1C, and TBE E is in grey. The view is the same as in FIG. 1C. FIGS. 1 and 2 were generated with BobScript (Esnouf, 1997; Kraulis, 1991) and Raster 3D (Merritt and Bacon, 1997).

FIG. 2 depicts the glycan at residue 153 in dengue 2 virus E protein. FIG. 2A depicts the E protein dimer, viewed perpendicular to the dyad axis (and to the view in FIG. 1A). Both glycans are approximately perpendicular to the viral surface. Domain I and the attached glycan are shown in red, domain II and the attached glycan are shown in yellow, and domain III is in blue. Disulfide bridges are shown in orange. The molecule of n-octyl-β-D-glucoside bound in the hydrophobic pocket underneath the k1 hairpin is in green. FIG. 2B depicts an enlargement of the area surrounding the glycan at residue 153 in domain I, with the structure of TBE envelope protein superimposed (gray) onto domain I of dengue virus E protein. The fusion peptide is highlighted in orange. The disulfide bridge between residues 92 and 105 is shown in green.

FIG. 3 depicts various mutations affecting the pH threshold of fusion in flaviviruses. The mutated residues line the interior of the ligand-binding pocket. For unconserved residues, the residue type in the virus in which the mutation was identified is listed first, followed by the residue type in dengue 2. The coloring code is the same as in FIG. 2.

FIG. 4 depicts the proposed subunit packing interactions in various flaviviral icosahedral assemblies. FIG. 4A depicts the suggested transition from the previously studied T=1 subviral particles (Ferlenghi et al., 2001) to the fusion competent T=1 particle at low pH. Upon acidification, domain II is proposed to swing out about a hinge at the domain I/II interface, creating homotrimeric contacts at the threefold axis. Clusters of three fusion peptides are displayed at the tip of each trimer. FIG. 4B depicts the packing in T=3 virus-like particles deduced from image reconstructions of dengue virions (Kuhn et al., 2002). The 180 subunits are not related by local threefold symmetry. FIG. 4C depicts the suggested packing intermediate for the T=3 particle at low pH. E is shown in its native (high pH) conformation. Since all monomers are related by local threefold symmetry, the low-pH conformational change will result in the formation of trimers, as in FIG. 4A.

FIG. 5 depicts various views of the structure of the dimer of dengue E soluble fragment (sE) in the mature virus particle. FIG. 5A depicts the three domains of dengue sE. Domain I is red, domain II is yellow, and domain III is blue. A 53-residue “stem” segment links the stably folded sE fragment with the C-terminal transmembrane anchor. FIG. 5B depicts the sE dimer (ref. 7). This is the conformation of E in the mature virus particle and in solution above the fusion pH. FIG. 5C depicts the packing of E on the surface of the virus. Electron cryomicroscopy image reconstructions show that E dimers pack in a non-equivalent T=3 icosahedral lattice (ref. 10). Note the absence of local threefold symmetry for most dimers.

FIG. 6 depicts the trimer formation and membrane insertion of dengue E protein. FIG. 6A depicts an electron micrograph showing E trimers inserted into liposomes. The liposomes are heavily decorated with E trimers. A large portion of the trimer can be seen protruding from the membrane. The samples were stained with uranyl formate (see Example 2). Scale bar=500 Å. FIG. 6B depicts the results of gel electrophoresis indicating that E trimers can be covalently cross-linked with ethylene glycol bis-succinimidyl-succinate (EGS) after insertion into liposomes. Lane 1: E solubilized from liposomes at pH 5.5 and not cross-linked. Lane 2: E solubilized from liposomes at pH 5.5 and cross-linked with EGS. Lane 3: E cross-linked with EGS at pH 7 in the absence of lipid.

FIG. 7 depicts various views of the domain rearrangements in the dengue sE monomer during the transition to trimer. FIG. 7A depicts an sE monomer in its pre-fusion conformation. This is the structure adopted in mature virus particles and in solution at pH>7, conditions under which E is a dimer. FIG. 7B depicts a schematic representation of the secondary structure of domain I and links to domains II and III in the pre-fusion conformation. FIG. 7C depicts an sE monomer in its post-fusion conformation, as seen in sE trimers. The three domains have rotated and shifted with respect to each other, bringing the C-terminus 39 Å closer to the fusion loop (orange). The fusion loop retains essentially the same conformation before and after fusion. FIG. 7D depicts the secondary structure of domain I and its links to domains II and III in the trimeric, post-fusion conformation. The domain I-III linker inserts between strands A0 and C0. The C-terminal region of A0 flips out, switches to the other β-sheet, and creates an annular trimer contact with the two other A0 strands in the trimer.

FIG. 8 depicts various views of the dengue sE trimer. FIG. 8A depicts a ribbon diagram with domain I in red, domain II in yellow, and domain III in blue. Hydrophobic side chains in the fusion loop (orange) are exposed. The expected position of the hydrocarbon layer of the fused membrane is shown in green. Representative lipids are shown to scale. The trimer only penetrates about 6 Å into the hydrocarbon layer of the membrane. A chloride ion (black sphere) binds near the fusion loop. FIG. 8B depicts a surface representation of the trimer. The C-terminus of sE is located 60 Å from the membrane. The crystallized sE fragment ends 53 residues short of the viral transmembrane domain. This 53-residue “stem” could easily reach the membrane. The dashed grey arrow indicates the most likely location for the stem (see Example 2). An extended cavity is visible near the tip of the trimer; access to this cavity will probably be occluded by the stem. The glycan on Asn67 and representative lipids are shown in space-filling representation. FIG. 8C depicts the membrane-distal end of the trimer, where most trimer contacts are formed. The view is along the threefold axis. FIG. 8D depicts a close-up of FIG. 8C showing trimer contacts. The A0B0 loop forms an annular trimer contact. The domain I-III linker (purple) adopts an extended conformation and forms additional trimer contacts. FIG. 8E depicts a close-up of the aromatic anchor formed by the three fusion loops (orange). Three strictly conserved hydrophobic residues interact with the membrane: Trp101, Leu107 and Phe108. The three clustered fusion loops form a nonpolar, bowl-shaped apex, which is underpinned by a small hydrophobic core. Underneath, a chloride ion (black sphere) forms a trimer contact.

FIG. 9 depicts a proposed mechanism for fusion mediated by class II viral fusion proteins. Full-length E is represented as in FIG. 5C, with the stem and viral transmembrane domains in dark blue. FIG. 9A: E binds to a receptor on the cell surface and the virion is internalized to an endosome. FIG. 9B: Reduced pH in the endosome causes domain II to hinge outward from the virion surface, thus destroying dimer contacts and exposing the fusion loop. E monomers are free to rearrange laterally in the plane of the viral membrane. FIG. 9C: The fusion loop inserts into the hydrocarbon layer of the host-cell membrane, promoting trimer formation. FIG. 9D: The formation of trimer contacts spreads from the fusion loop at the tip of the trimer to the base of the trimer. Domain III shifts and rotates to create trimer contacts, causing the C-terminal portion of E to fold back towards the fusion loop. The energy released by this refolding bends the apposed membranes. FIG. 9E: Thermal motions in the lipid bilayer lead to the spontaneous fusion first of the cis monolayers (“hemifusion”), and then of the trans monolayers to form a lipidic fusion pore. This process is facilitated by the creation of additional trimer contacts between the stem and domain II.

FIG. 10 depicts fluorescence depolarization binding data for a peptide corresponding to residues 396-429 (in the “stem” region) of dengue envelope protein (E). FIG. 10A depicts a Kd analysis of the peptide's binding affinity to the trimeric, post-fusion form of E. FIG. 10B depicts a competitive binding analysis with the fluorescent and unlabeled peptide.

FIG. 11 depicts a fluorescence depolarization Kd analysis to measure the affinity between a peptide corresponding to residues 413-447 (in the “stem” region) of dengue envelope protein (E) and the trimeric, post-fusion form of E.

DETAILED DESCRIPTION OF THE INVENTION

A. General

We have determined the structures of the E protein in both its pre-fusion and post-fusion conformations.

The pre-fusion structure was determined by solving the structure of a soluble fragment (residues 1-394) of the E protein from dengue virus type 2. This fragment contains all but about 45 residues of the E-protein ectodomain (FIG. 1A). It resembles closely, in its dimeric structure and in the details of its protein fold, the E protein from tick-borne encephalitis (TBE) virus, studied previously. We have examined crystals grown in both the presence and the absence of the detergent n-octyl-β-D-glucoside (β-OG). The key difference between the two structures is a local rearrangement of the “k1” β-hairpin (residues 268-280) and the concomitant opening up of a hydrophobic pocket, occupied by a molecule of β-OG. Mutations affecting the pH threshold for fusion map to the hydrophobic pocket, which we propose is a hinge point in the fusion-activating conformational change. Detergent binding marks the k1 β-hairpin and associated pocket as a potential target for viral fusion inhibitors. We have also discovered another region, the domain 1-3 region, which may serve as a target for viral fusion inhibitors.

The structure of the soluble E ectodomain (sE) in its trimeric, post-fusion state reveals striking differences from the dimeric, pre-fusion form. The elongated trimer bears three “fusion loops” at one end, to insert into the host-cell membrane. Their structure allows us to model directly how they interact with a lipid bilayer. The protein folds back on itself, directing its C-terminus towards the fusion loop. We propose a fusion mechanism driven by essentially irreversible conformational changes in dengue virus E protein and facilitated by fusion-loop insertion into the outer bilayer leaflet. Specific features of the folded-back structure suggest strategies for inhibiting flavivirus entry, as well as druggable regions. The regions may serve as targets for viral fusion inhibitors and for assays to discover such inhibitors.

Hence, we have discovered a variety of novel, structurally defined druggable regions which may present targets for a specific viral fusion inhibitor for dengue virus and other viruses having class II E protein. Because dengue virus type 2 E protein is strongly homologous to other dengue viral types, as well as other flavivirus E proteins and class II E proteins (Lindenbach and Rice, 2001, Rey, et al. 1995, Hahn, et al. 1998), these binding sites are likely present in those E proteins as well and may serve as targets for specific viral fusion inhibitors for those viruses.

Finally, we have also discovered that peptides corresponding to residues 396-429 and 413-447 (in the “stem” region) of dengue envelope protein (E) bind with fairly high affinity and specificity to the trimeric, post-fusion form of sE, the fragment of E spanning residues 1-395, which we crystallized first in the pre-fusion form and then in the post-fusion form. Inhibitor peptides derived from stem sequences may block completion of the conformational change by interacting with the relevant surfaces on the clustered domains II. Such inhibitors would interfere with the second stage of the conformational change. These peptides themselves may serve as specific viral fusion inhibitors, or may provide the basis from which to design improved specific viral fusion inhibitors.

B. Definitions

For convenience, before further description of the present invention, certain terms employed in the specification, examples, and appendant claims are collected here. These definitions should be read in light of the entire disclosure and understood as by a person of skill in the art.

The articles “a” and “an” are used herein to refer to one or to more than one (i.e., to at least one) of the grammatical object of the article. By way of example, “an element” means one element or more than one element.

The term “activity” of a class II E protein refers to the ability of the protein to mediate both receptor binding and fusion between a virus and a cell.

The term “amino acid” is intended to embrace all molecules, whether natural or synthetic, which include both an amino functionality and an acid functionality and are capable of being included in a polymer of naturally-occurring amino acids. Exemplary amino acids include naturally-occurring amino acids; analogs, derivatives and congeners thereof; amino acid analogs having variant side chains; and all stereoisomers of any of the foregoing.

The term “binding” refers to an association, which may be a stable association, between two molecules, e.g., between a dengue virus E protein or another class II E protein and a binding partner, due to, for example, electrostatic, hydrophobic, ionic and/or hydrogen-bond interactions under physiological conditions.

The term “complex” refers to an association between at least two moieties (e.g. chemical or biochemical) that have an affinity for one another. Examples of complexes include associations between antigen/antibodies, lectin/carbohydrate, target polynucleotide/probe oligonucleotide, antibody/anti-antibody, receptor/ligand, enzyme/ligand, polypeptide/polypeptide, polypeptide/polynucleotide, polypeptide/co-factor, polypeptide/substrate, polypeptide/modulator, polypeptide/small molecule, and the like. “Member of a complex” refers to one moiety of the complex, such as an antigen or ligand. “Protein complex” or “polypeptide complex” refers to a complex comprising at least one polypeptide.

The term “compound” as used herein refers to any agent, molecule, complex, or other entity that may be capable of binding to or interacting with a protein.

The term “conserved residue” refers to an amino acid that is a member of a group of amino acids having certain common properties. The term “conservative amino acid substitution” refers to the substitution (conceptually or otherwise) of an amino acid from one such group with a different amino acid from the same group. A functional way to define common properties between individual amino acids is to analyze the normalized frequencies of amino acid changes between corresponding proteins of homologous organisms (Schulz, G. E. and R. H. Schirmer, Principles of Protein Structure, Springer-Verlag). According to such analyses, groups of amino acids may be defined where amino acids within a group exchange preferentially with each other, and therefore resemble each other most in their impact on the overall protein structure (Schulz, G. E. and R. H. Schirmer, Principles of Protein Structure, Springer-Verlag). One example of a set of amino acid groups defined in this manner includes: (i) a charged group, consisting of Glu and Asp, Lys, Arg and His, (ii) a positively-charged group, consisting of Lys, Arg and His, (iii) a negatively-charged group, consisting of Glu and Asp, (iv) an aromatic group, consisting of Phe, Tyr and Trp, (v) a nitrogen ring group, consisting of His and Trp, (vi) a large aliphatic nonpolar group, consisting of Val, Leu and Ile, (vii) a slightly-polar group, consisting of Met and Cys, (viii) a small-residue group, consisting of Ser, Thr, Asp, Asn, Gly, Ala, Glu, Gln and Pro, (ix) an aliphatic group consisting of Val, Leu, Ile, Met and Cys, and (x) a small hydroxyl group consisting of Ser and Thr.
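By way of illustration only, the exemplary grouping above may be encoded as a lookup table to test whether a given substitution stays within at least one group. The following is a minimal sketch; the helper names `shared_groups` and `is_conservative` and the use of three-letter residue codes are assumptions made for this example and are not part of the disclosure.

```python
# Amino acid groups (i)-(x) transcribed from the definition above,
# using three-letter residue codes.
GROUPS = {
    "charged": {"Glu", "Asp", "Lys", "Arg", "His"},
    "positively charged": {"Lys", "Arg", "His"},
    "negatively charged": {"Glu", "Asp"},
    "aromatic": {"Phe", "Tyr", "Trp"},
    "nitrogen ring": {"His", "Trp"},
    "large aliphatic nonpolar": {"Val", "Leu", "Ile"},
    "slightly polar": {"Met", "Cys"},
    "small residue": {"Ser", "Thr", "Asp", "Asn", "Gly", "Ala", "Glu", "Gln", "Pro"},
    "aliphatic": {"Val", "Leu", "Ile", "Met", "Cys"},
    "small hydroxyl": {"Ser", "Thr"},
}


def shared_groups(original, replacement):
    """Return the sorted names of all groups that contain both residues."""
    return sorted(
        name
        for name, members in GROUPS.items()
        if original in members and replacement in members
    )


def is_conservative(original, replacement):
    """Hypothetical helper: True if a different residue comes from a shared group."""
    return original != replacement and bool(shared_groups(original, replacement))


print(is_conservative("Leu", "Ile"))   # both are large aliphatic nonpolar residues
print(shared_groups("Lys", "Arg"))     # groups containing both Lys and Arg
print(is_conservative("Trp", "Gly"))   # no group above contains both
```

Because the groups overlap, a pair of residues may share more than one group; the sketch treats membership in any single shared group as sufficient.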

The term “disease caused by a flavivirus or other virus having class II E protein” refers to any disorder or disease caused by infection of a subject with a flavivirus or other virus having class II E protein. Exemplary diseases caused by a flavivirus or other virus having class II E protein include, but are not limited to, dengue fever, dengue hemorrhagic fever, tick-borne encephalitis, West Nile virus disease, yellow fever, Kyasanur Forest disease, louping ill, hepatitis C, Ross River virus disease, and O'nyong-nyong fever.

The term “domain”, when used in connection with a polypeptide, refers to a specific region within such polypeptide that comprises a particular structure or mediates a particular function. In the typical case, a domain of a dengue virus E protein or other class II E protein is a fragment of the polypeptide. In certain instances, a domain is a structurally stable domain, as evidenced, for example, by mass spectroscopy, or by the fact that a modulator may bind to a druggable region of the domain.

The term “domain I-III region” as used herein refers to any structural motif having homology to the motif comprising at least one of residues 38-40; 143-147; 294-296; and 354-365 of a dengue virus E protein.

The term “domain II” as used herein refers to any structural motif having homology to the motif comprising at least one of residues 52-132 and 193-280 of a dengue virus E protein.

The term “druggable region”, when used in reference to a polypeptide, nucleic acid, complex and the like, refers to a region of a dengue virus E protein or other class II E protein which is a target or is a likely target for binding an agent that reduces or inhibits viral infectivity. For a polypeptide, a druggable region generally refers to a region wherein several amino acids of a polypeptide would be capable of interacting with an agent. For a polypeptide or complex thereof, exemplary druggable regions include binding pockets and sites, interfaces between domains of a polypeptide or complex, surface grooves or contours or surfaces of a polypeptide or complex which are capable of participating in interactions with another molecule, such as a cell membrane. In particular, a subject druggable region is the k1 hairpin region and its associated detergent binding pocket. In yet another example, a subject druggable region is the domain 1-3 region comprising at least one of residues 38-40; 143-147; 294-296; and 354-365 of a dengue virus E protein or the homologous residues in other class II E protein.

A druggable region may be described and characterized in a number of ways. For example, a druggable region may be characterized by some or all of the amino acids that make up the region, or the backbone atoms thereof, or the side chain atoms thereof (optionally with or without the Cα atoms). Alternatively, a druggable region may be characterized by comparison to other regions on the same or other molecules. For example, the term “affinity region” refers to a druggable region on a molecule (such as a dengue virus E protein or other class II E protein) that is present in several other molecules, inasmuch as the structures of the same affinity regions are sufficiently the same so that they are expected to bind the same or related structural analogs. An example of an affinity region is an ATP-binding site of a protein kinase that is found in several protein kinases (whether or not of the same origin). The term “selectivity region” refers to a druggable region of a molecule that may not be found on other molecules, inasmuch as the structures of different selectivity regions are sufficiently different so that they are not expected to bind the same or related structural analogs. An exemplary selectivity region is a catalytic domain of a protein kinase that exhibits specificity for one substrate. In certain instances, a single modulator may bind to the same affinity region across a number of proteins that have a substantially similar biological function, whereas the same modulator may bind to only one selectivity region of one of those proteins.

Continuing with examples of different druggable regions, the term “undesired region” refers to a druggable region of a molecule that upon interacting with another molecule results in an undesirable effect. For example, a binding site that oxidizes the interacting molecule (such as P-450 activity) and thereby results in increased toxicity for the oxidized molecule may be deemed an “undesired region”. Other examples of potential undesired regions include regions that upon interaction with a drug decrease the membrane permeability of the drug, increase the excretion of the drug, or increase the blood brain transport of the drug. It may be the case that, in certain circumstances, an undesired region will no longer be deemed an undesired region because the effect of the region will be favorable, e.g., a drug intended to treat a brain condition would benefit from interacting with a region that resulted in increased blood brain transport, whereas the same region could be deemed undesirable for drugs that were not intended to be delivered to the brain.

When used in reference to a druggable region, the “selectivity” or “specificity” of a molecule such as a modulator to a druggable region may be used to describe the binding between the molecule and a druggable region. For example, the selectivity of a modulator with respect to a druggable region may be expressed by comparison to another modulator, using the respective values of Kd (i.e., the dissociation constants for each modulator-druggable region complex) or, in cases where a biological effect is observed below the Kd, the ratio of the respective EC50's (i.e., the concentrations that produce 50% of the maximum response for the modulator interacting with each druggable region).
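The comparison described above reduces to a ratio of two constants measured in the same units. The following is a minimal sketch, assuming that the ratio is formed as reference value over modulator value; the function name `selectivity_ratio` and the numerical values are hypothetical and are not measured data.

```python
def selectivity_ratio(reference_value, modulator_value):
    """Ratio of two dissociation constants (or two EC50 values) in the same units.

    A result greater than 1 means the modulator has the lower Kd (or EC50),
    i.e., it binds more tightly (or is more potent) than the reference.
    """
    if reference_value <= 0 or modulator_value <= 0:
        raise ValueError("Kd and EC50 values must be positive")
    return reference_value / modulator_value


# Hypothetical dissociation constants in nM for two modulators binding
# the same druggable region.
kd_reference_nm = 500.0
kd_modulator_nm = 5.0
print(selectivity_ratio(kd_reference_nm, kd_modulator_nm))
```

The same function applies unchanged to a pair of EC50 values, provided both are expressed in the same concentration units.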

The term “class II E protein” refers to any protein (full-length or fragment) having the sequence of a major class II virus envelope glycoprotein, E, derived from a flavivirus, alphavirus, or hepatitis virus. The term “dengue virus E protein” refers to a major virus envelope glycoprotein, E, derived from a dengue fever virus of any type. The full-length 495-residue sequence of dengue virus E protein from dengue virus type 2 is SEQ ID NO: 1: MRCIGISNRDFVEGVSGGSWVDIVLEHGSCVTTMAKNKPTLDFELIKTEAKQPATLRKYCIEAKLTNTTTDSRCPTQGEPTLNEEQDKRFVCKHSMVDRGWGNGCGLFGKGGIVTCAMFTCKKNMEGKIVQPENLEYTVVITPHSGEEHAVGNDTGKHGKEVKITPQSSITEAELTGYGTVTMECSPRTGLDFNEMVLLQMKDKAWLVHRQWFLDLPLPWLPGADTQGSNWIQKETLVTFKNPHAKKQDVVVLGSQEGAMHTALTGATEIQMSSGNLLFTGHLKCRLRMDKLQLKGMSYSMCTGKFKVVKEIAETQHGTIVIRVQYEGDGSPCKTPFEIMDLEKRHVLGRLTTVNPIVTEKDSPVNIEAEPPFGDSYRIGVEPGQLKLDWFKKGSSIGQMFETTMRGAKRMAILGDTAWDFGSLGGVFTSIGKALHQVFGAIYGAAFSGVSWTMKILIGVIITWIGMNSRSTSLSVSLVLVGIVTLYLGVMVQA. The term “dengue virus E protein” encompasses sequences with at least 85% identity to this sequence, such as, for example, an E protein from dengue virus type 1, and also encompasses fragments of the full-length E protein. For example, the truncated dengue virus E protein used herein is SEQ ID NO: 2 and consists of residues 1-394 of full-length dengue virus type 2 E protein: MRCIGISNRDFVEGVSGGSWVDIVLEHGSCVTTMAKNKPTLDFELIKTEAKQPATLRKYCIEAKLTNTTTDSRCPTQGEPTLNEEQDKRFVCKHSMVDRGWGNGCGLFGKGGIVTCAMFTCKKNMEGKIVQPENLEYTVVITPHSGEEHAVGNDTGKHGKEVKITPQSSITEAELTGYGTVTMECSPRTGLDFNEMVLLQMKDKAWLVHRQWFLDLPLPWLPGADTQGSNWIQKETLVTFKNPHAKKQDVVVLGSQEGAMHTALTGATEIQMSSGNLLFTGHLKCRLRMDKLQLKGMSYSMCTGKFKVVKEIAETQHGTIVIRVQYEGDGSPCKTPFEIMDLEKRHVLGRLTTVNPIVTEKDSPVNIEAEPPFGDSYIIIGVEPGQLKLDWFKK. Other fragments may be shorter or longer. Such E proteins and protein fragments may be produced by any method known in the art, including purification from natural sources, recombinant methods, and peptide synthesis. Such proteins may be produced in a soluble form (referred to herein as “sE”), e.g. lacking transmembrane regions, or solubilized using appropriate reagents (such as a detergent).
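The identity threshold and the residue-range convention used in this definition can be illustrated with a short sketch. It assumes that the two sequences have already been aligned to equal length without gaps (a real comparison would first require a sequence alignment), and the function names `percent_identity` and `fragment` are hypothetical. The reference string is only the first 20 residues of SEQ ID NO: 1, and the variant is a made-up sequence with two substitutions.

```python
def percent_identity(seq_a, seq_b):
    """Percent of identical positions between two pre-aligned, equal-length sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def fragment(sequence, first, last):
    """Return residues first..last, numbered from 1 and inclusive (e.g., 'residues 1-394')."""
    if not 1 <= first <= last <= len(sequence):
        raise ValueError("residue range lies outside the sequence")
    return sequence[first - 1:last]


# First 20 residues of SEQ ID NO: 1, used here only as a short test string.
reference = "MRCIGISNRDFVEGVSGGSW"
# Made-up variant with two substitutions (positions 4 and 15), for illustration.
variant = "MRCVGISNRDFVEGLSGGSW"

identity = percent_identity(reference, variant)
print(identity, identity >= 85.0)
print(fragment(reference, 1, 10))
```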

The term “gene” refers to a nucleic acid comprising an open reading frame encoding a polypeptide having exon sequences and optionally intron sequences. The term “intron” refers to a DNA sequence present in a given gene which is not translated into protein and is generally found between exons.

The term “having substantially similar biological activity”, when used in reference to two polypeptides, refers to a biological activity of a first polypeptide which is substantially similar to at least one of the biological activities of a second polypeptide. A substantially similar biological activity means that the polypeptides carry out a similar function, e.g., a similar enzymatic reaction or a similar physiological process, etc. For example, two homologous proteins may have a substantially similar biological activity if they are involved in a similar enzymatic reaction, e.g., they are both kinases which catalyze phosphorylation of a substrate polypeptide; however, they may phosphorylate different regions on the same protein substrate or different substrate proteins altogether. Alternatively, two homologous proteins may also have a substantially similar biological activity if they are both involved in a similar physiological process, e.g., transcription. For example, two proteins may be transcription factors; however, they may bind to different DNA sequences or bind to different polypeptide interactors. Substantially similar biological activities may also be associated with proteins carrying out a similar structural role, for example, two membrane proteins.

The term “isolated polypeptide” refers to a polypeptide, in certain embodiments prepared from recombinant DNA or RNA, or of synthetic origin, or some combination thereof, which (1) is not associated with proteins that it is normally found with in nature, (2) is isolated from the cell in which it occurs, (3) is isolated free of other proteins from the same cellular source, (4) is expressed by a cell from a different species, or (5) does not occur in nature.

The term “isolated nucleic acid” refers to a polynucleotide of genomic, cDNA, or synthetic origin or some combination thereof, which (1) is not associated with the cell in which the “isolated nucleic acid” is found in nature, or (2) is operably linked to a polynucleotide to which it is not linked in nature.

The term “k1 hairpin”, or “E protein k1 h”, as used herein refers to any structural motif having homology to the motif comprising at least residues 268-280, and in some embodiments further comprising at least one of residues 47-54, 128-137, and 187-207.

The term “mammal” is known in the art, and exemplary mammals include humans, primates, bovines, porcines, canines, felines, and rodents (e.g., mice and rats).

The term “modulation”, when used in reference to a functional property or biological activity or process (e.g., enzyme activity or receptor binding), refers to the capacity to either up regulate (e.g., activate or stimulate), down regulate (e.g., inhibit or suppress) or otherwise change a quality of such property, activity or process. In certain instances, such regulation may be contingent on the occurrence of a specific event, such as activation of a signal transduction pathway, and/or may be manifest only in particular cell types.

The term “modulator” refers to a polypeptide, nucleic acid, macromolecule, complex, molecule, small molecule, compound, species or the like (naturally-occurring or non-naturally-occurring), or an extract made from biological materials such as bacteria, plants, fungi, or animal cells or tissues, that may be capable of causing modulation. Modulators may be evaluated for potential activity as modulators or activators (directly or indirectly) of a functional property, biological activity or process, or combination of them (e.g., agonist, partial antagonist, partial agonist, inverse agonist, antagonist, anti-microbial agents, modulators of microbial infection or proliferation, and the like) by inclusion in assays. In such assays, many modulators may be screened at one time. The activity of a modulator may be known, unknown or partially known. The term “inhibitor” refers to a polypeptide, nucleic acid, macromolecule, complex, molecule, small molecule, compound, species or the like (naturally-occurring or non-naturally-occurring), or an extract made from biological materials such as bacteria, plants, fungi, or animal cells or tissues, that may be capable of down-regulating or suppressing a functional property or biological activity or process.

The term “motif” refers to an amino acid sequence that is commonly found in a protein of a particular structure or function. Typically, a consensus sequence is defined to represent a particular motif. The consensus sequence need not be strictly defined and may contain positions of variability, degeneracy, variability of length, etc. The consensus sequence may be used to search a database to identify other proteins that may have a similar structure or function due to the presence of the motif in its amino acid sequence. For example, on-line databases may be searched with a consensus sequence in order to identify other proteins containing a particular motif. Various search algorithms and/or programs may be used, including FASTA, BLAST or ENTREZ. FASTA and BLAST are available as a part of the GCG sequence analysis package (University of Wisconsin, Madison, Wis.). ENTREZ is available through the National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Md.
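A degenerate consensus sequence of the kind described above may be expressed as a regular expression and scanned against a sequence. The following minimal sketch uses Python's standard `re` module rather than FASTA, BLAST or ENTREZ; the function name `find_motif`, the pattern (Asn, then any residue except Pro, then Ser or Thr) and the toy sequence are assumptions chosen purely for illustration.

```python
import re


def find_motif(consensus_regex, sequence):
    """Return (1-based start position, matched text) for each non-overlapping match."""
    return [
        (match.start() + 1, match.group())
        for match in re.finditer(consensus_regex, sequence)
    ]


# Illustrative degenerate consensus in one-letter code:
# N, then any residue other than P, then S or T.
CONSENSUS = r"N[^P][ST]"

# Toy sequence, not taken from any real protein.
toy_sequence = "AANGTCCNPSAANKS"

print(find_motif(CONSENSUS, toy_sequence))
```

In the toy sequence, the run beginning at position 8 is not reported because its second residue is Pro, which the variable position of this pattern excludes.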

The term “nucleic acid” refers to a polymeric form of nucleotides, either ribonucleotides or deoxynucleotides or a modified form of either type of nucleotide. The terms should also be understood to include, as equivalents, analogs of either RNA or DNA made from nucleotide analogs, and, as applicable to the embodiment being described, single-stranded (such as sense or antisense) and double-stranded polynucleotides.

The term “polypeptide”, and the terms “protein” and “peptide” which are used interchangeably herein, refers to a polymer of amino acids. Exemplary polypeptides include gene products, naturally-occurring proteins, homologs, orthologs, paralogs, fragments, and other equivalents, variants and analogs of the foregoing.

The terms “polypeptide fragment” or “fragment”, when used in reference to a reference polypeptide, refers to a polypeptide in which amino acid residues are deleted as compared to the reference polypeptide itself, but where the remaining amino acid sequence is usually identical to the corresponding positions in the reference polypeptide. Such deletions may occur at the amino-terminus or carboxy-terminus of the reference polypeptide, or alternatively both. Fragments typically are at least 5, 6, 8 or 10 amino acids long, at least 14 amino acids long, at least 20, 30, 40 or 50 amino acids long, at least 75 amino acids long, or at least 100, 150, 200, 300, 500 or more amino acids long. A fragment can retain one or more of the biological activities of the reference polypeptide. In certain embodiments, a fragment may comprise a druggable region, and optionally additional amino acids on one or both sides of the druggable region, which additional amino acids may number from 5, 10, 15, 20, 30, 40, 50, or up to 100 or more residues. Further, fragments can include a sub-fragment of a specific region, which sub-fragment retains a function of the region from which it is derived. In another embodiment, a fragment may have immunogenic properties.

The term “purified” refers to an object species that is the predominant species present (i.e., on a molar basis it is more abundant than any other individual species in the composition). A “purified fraction” is a composition wherein the object species comprises at least about 50 percent (on a molar basis) of all species present. In making the determination of the purity of a species in solution or dispersion, the solvent or matrix in which the species is dissolved or dispersed is usually not included in such determination; instead, only the species (including the one of interest) dissolved or dispersed are taken into account. Generally, a purified composition will have one species that comprises more than about 80 percent of all species present in the composition, more than about 85%, 90%, 95%, 99% or more of all species present. The object species may be purified to essential homogeneity (contaminant species cannot be detected in the composition by conventional detection methods) wherein the composition consists essentially of a single species. A skilled artisan may purify a dengue virus E protein or other class II E protein using standard techniques for protein purification in light of the teachings herein. Purity of a polypeptide may be determined by a number of methods known to those of skill in the art, including for example, amino-terminal amino acid sequence analysis, gel electrophoresis, mass-spectrometry analysis and the methods described in the Exemplification section herein.
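The distinction between a “purified” species (predominant on a molar basis) and a “purified fraction” (at least about 50 percent on a molar basis) can be made concrete with a short calculation. The following is a sketch under stated assumptions: the function names, the species labels and the molar amounts are all hypothetical, and the solvent or matrix is assumed to have been excluded from the input already.

```python
def molar_fractions(amounts):
    """Map each species to its fraction of the total moles (solvent/matrix excluded)."""
    total = sum(amounts.values())
    if total <= 0:
        raise ValueError("total molar amount must be positive")
    return {name: moles / total for name, moles in amounts.items()}


def is_purified(amounts, target):
    """True if the target is more abundant (in moles) than any other individual species."""
    return all(
        amounts[target] > moles for name, moles in amounts.items() if name != target
    )


def is_purified_fraction(amounts, target, threshold=0.50):
    """True if the target makes up at least `threshold` of all species on a molar basis."""
    return molar_fractions(amounts)[target] >= threshold


# Hypothetical compositions in nmol; labels and numbers are illustrative only.
sample_a = {"sE": 6.0, "contaminant A": 3.0, "contaminant B": 1.0}
sample_b = {"sE": 4.0, "contaminant A": 3.0, "contaminant B": 3.0}

print(is_purified(sample_a, "sE"), is_purified_fraction(sample_a, "sE"))
print(is_purified(sample_b, "sE"), is_purified_fraction(sample_b, "sE"))
```

In the second composition the target is still the predominant species, yet it falls short of the 50 percent level, which shows that the two criteria are not equivalent.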

The terms “recombinant protein” or “recombinant polypeptide” refer to a polypeptide which is produced by recombinant DNA techniques. An example of such techniques includes the case when DNA encoding the expressed protein is inserted into a suitable expression vector which is in turn used to transform a host cell to produce the protein or polypeptide encoded by the DNA.

The term “small molecule” refers to a compound, which has a molecular weight of less than about 5 kD, less than about 2.5 kD, less than about 1.5 kD, or less than about 0.9 kD. Small molecules may be, for example, nucleic acids, peptides, polypeptides, peptide nucleic acids, peptidomimetics, carbohydrates, lipids or other organic (carbon containing) or inorganic molecules. Many pharmaceutical companies have extensive libraries of chemical and/or biological mixtures, often fungal, bacterial, or algal extracts, which can be screened with any of the assays of the invention. The term “small organic molecule” refers to a small molecule that is often identified as being an organic or medicinal compound, and does not include molecules that are exclusively nucleic acids, peptides or polypeptides.

The term “soluble” as used herein with reference to a dengue virus E protein or other class II E protein or other protein, means that upon expression in cell culture, at least some portion of the polypeptide or protein expressed remains in the cytoplasmic fraction of the cell and does not fractionate with the cellular debris upon lysis and centrifugation of the lysate. Solubility of a polypeptide may be increased by a variety of art recognized methods, including fusion to a heterologous amino acid sequence, deletion of amino acid residues, amino acid substitution (e.g., enriching the sequence with amino acid residues having hydrophilic side chains), and chemical modification (e.g., addition of hydrophilic groups). The solubility of polypeptides may be measured using a variety of art recognized techniques, including dynamic light scattering to determine aggregation state, UV absorption, centrifugation to separate aggregated from non-aggregated material, and SDS gel electrophoresis (e.g., the amount of protein in the soluble fraction is compared to the amount of protein in the soluble and insoluble fractions combined). When expressed in a host cell, the polypeptides of the invention may be at least about 1%, 2%, 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or more soluble, e.g., at least about 1%, 2%, 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or more of the total amount of protein expressed in the cell is found in the cytoplasmic fraction. In certain embodiments, a one liter culture of cells expressing a dengue virus E protein or other class II E protein will produce at least about 0.1, 0.2, 0.5, 1, 2, 5, 10, 20, 30, 40, 50 milligrams or more of soluble protein. In an exemplary embodiment, a dengue virus E protein or other class II E protein is at least about 10% soluble and will produce at least about 1 milligram of protein from a one liter cell culture.
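The percent-soluble figure described above is the amount of protein in the soluble fraction divided by the amount in the soluble and insoluble fractions combined. The following minimal sketch applies that ratio together with the thresholds of the exemplary embodiment (about 10% soluble, about 1 milligram per liter of culture); the function names and the example quantities are hypothetical.

```python
def percent_soluble(soluble_mg, insoluble_mg):
    """Soluble protein as a percentage of soluble plus insoluble protein."""
    total = soluble_mg + insoluble_mg
    if soluble_mg < 0 or insoluble_mg < 0 or total <= 0:
        raise ValueError("amounts must be non-negative and not both zero")
    return 100.0 * soluble_mg / total


def meets_exemplary_embodiment(soluble_mg, insoluble_mg, culture_liters):
    """Check the two thresholds of the exemplary embodiment described above."""
    if culture_liters <= 0:
        raise ValueError("culture volume must be positive")
    soluble_enough = percent_soluble(soluble_mg, insoluble_mg) >= 10.0
    yield_enough = soluble_mg / culture_liters >= 1.0  # mg of soluble protein per liter
    return soluble_enough and yield_enough


# Hypothetical quantities from a one liter culture, for illustration only.
print(percent_soluble(2.0, 6.0))
print(meets_exemplary_embodiment(2.0, 6.0, 1.0))
print(meets_exemplary_embodiment(0.5, 9.5, 1.0))
```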

The term “structural motif”, when used in reference to a polypeptide, refers to a polypeptide that, although it may have different amino acid sequences, may result in a similar structure, wherein by structure is meant that the motif forms generally the same tertiary structure, or that certain amino acid residues within the motif, or alternatively their backbone or side chains (which may or may not include the Cα atoms of the side chains) are positioned in a like relationship with respect to one another in the motif.

The term “test compound” refers to a molecule to be tested by one or more screening method(s) as a putative modulator of a dengue virus E protein or other class II E protein or other biological entity or process. A test compound is usually not known to bind to a target of interest. The term “control test compound” refers to a compound known to bind to the target (e.g., a known agonist, antagonist, partial agonist or inverse agonist). The term “test compound” does not include a chemical added as a control condition that alters the function of the target to determine signal specificity in an assay. Such control chemicals or conditions include chemicals that 1) nonspecifically or substantially disrupt protein structure (e.g., denaturing agents (e.g., urea or guanidinium), chaotropic agents, sulfhydryl reagents (e.g., dithiothreitol and β-mercaptoethanol), and proteases), 2) generally inhibit cell metabolism (e.g., mitochondrial uncouplers) and 3) non-specifically disrupt electrostatic or hydrophobic interactions of a protein (e.g., high salt concentrations, or detergents at concentrations sufficient to non-specifically disrupt hydrophobic interactions). Further, the term “test compound” also does not include compounds known to be unsuitable for a therapeutic use for a particular indication due to toxicity to the subject. In certain embodiments, various predetermined concentrations of test compounds are used for screening such as 0.01 mM, 0.1 mM, 1.0 mM, and 10.0 mM. Examples of test compounds include, but are not limited to, peptides, nucleic acids, carbohydrates, and small molecules.

The term “novel test compound” refers to a test compound that is not in existence as of the filing date of this application. In certain assays using novel test compounds, the novel test compounds comprise at least about 50%, 75%, 85%, 90%, 95% or more of the test compounds used in the assay or in any particular trial of the assay.

The term “therapeutically effective amount” refers to that amount of a modulator, drug or other molecule which is sufficient to effect treatment when administered to a subject in need of such treatment. The therapeutically effective amount will vary depending upon the subject and disease condition being treated, the weight and age of the subject, the severity of the disease condition, the manner of administration and the like, which can readily be determined by one of ordinary skill in the art.

The term “transfection” means the introduction of a nucleic acid, e.g., an expression vector, into a recipient cell, which in certain instances involves nucleic acid-mediated gene transfer. The term “transformation” refers to a process in which a cell's genotype is changed as a result of the cellular uptake of exogenous nucleic acid. For example, a transformed cell may express a recombinant form of a dengue virus E protein or other class II E protein, or antisense expression may occur from the transferred gene so that the expression of a naturally-occurring form of the gene is disrupted.

The term “transgene” means a nucleic acid sequence which is partly or entirely heterologous to a transgenic animal or cell into which it is introduced, or is homologous to an endogenous gene of the transgenic animal or cell into which it is introduced, but which is designed to be inserted, or is inserted, into the animal's genome in such a way as to alter the genome of the cell into which it is inserted (e.g., it is inserted at a location which differs from that of the natural gene or its insertion results in a knockout). A transgene may include one or more regulatory sequences and any other nucleic acids, such as introns, that may be necessary for optimal expression.

The term “transgenic animal” refers to any animal, for example, a mouse, rat or other non-human mammal, a bird or an amphibian, in which one or more of the cells of the animal contain heterologous nucleic acid introduced by way of human intervention, such as by transgenic techniques well known in the art. The nucleic acid is introduced into the cell, directly or indirectly, by way of deliberate genetic manipulation, such as by microinjection or by infection with a recombinant virus. The term genetic manipulation does not include classical cross-breeding, or in vitro fertilization, but rather is directed to the introduction of a recombinant DNA molecule. This molecule may be integrated within a chromosome, or it may be extrachromosomally replicating DNA. In the typical transgenic animals described herein, the transgene causes cells to express a recombinant form of a protein. However, transgenic animals in which the recombinant gene is silent are also contemplated.

The term “vector” refers to a nucleic acid capable of transporting another nucleic acid to which it has been linked. One type of vector which may be used in accord with the invention is an episome, i.e., a nucleic acid capable of extra-chromosomal replication. Other vectors include those capable of autonomous replication and expression of nucleic acids to which they are linked. Vectors capable of directing the expression of genes to which they are operatively linked are referred to herein as “expression vectors”. In general, expression vectors of utility in recombinant DNA techniques are often in the form of “plasmids”, which refer to circular double stranded DNA molecules which, in their vector form, are not bound to the chromosome. In the present specification, “plasmid” and “vector” are used interchangeably as the plasmid is the most commonly used form of vector. However, the invention is intended to include such other forms of expression vectors which serve equivalent functions and which become known in the art subsequently hereto.

Unless otherwise indicated, all numbers expressing quantities of ingredients, reaction conditions, and so forth used in the specification and claims are to be understood as being modified in all instances by the term “about.” Accordingly, unless indicated to the contrary, the numerical parameters set forth in this specification and attached claims are approximations that may vary depending upon the desired properties sought to be obtained by the present invention.

C. Drug Discovery

C.1. Druggable Regions

Based in part on the structural information described in the Exemplification, we have identified novel druggable regions in dengue virus E protein. In one embodiment, the druggable region is comprised of the k1 hairpin or a portion thereof. In certain embodiments, the k1 hairpin may be comprised of at least one of residues 268-280 of a dengue virus E protein or the homologous residues in other class II E protein. In other embodiments, the druggable region or active site region may be comprised of the k1 hairpin and at least one of residues 47-54, 128-137, and 187-207.

In yet another embodiment, the druggable region may comprise the regions involved in the binding of residues 396-429 (the “stem” region of dengue envelope protein E) to the trimeric, post-fusion form of dengue virus E protein or other flavivirus E protein. In one embodiment, the druggable region is comprised of the stem region or a portion thereof. The stem region comprises residues 396-447, or fragments thereof, for example 396-429 and 413-447. In another embodiment, the druggable region is comprised of the channel in which the stem region binds. The channel is comprised of the residues at the trimer interface formed by domain II of each subunit in the trimer. Domain II consists of residues 52-132 and 193-280. A second region is the channel where the stem binds, formed by residues in domain II.

In another embodiment, the druggable region is comprised of the domain I-III region. In certain embodiments, the domain I-III region may be comprised of at least one of residues 38-40; 143-147; 294-296; and 354-365 of a dengue virus E protein or the homologous residues in other class II E protein. In other embodiments, the druggable region may be comprised of the domain I-domain III linker (residues 294-301).

In yet another embodiment, a druggable region is comprised of the fusion loop or a portion thereof.

Other regions of the protein may in certain embodiments comprise a druggable region. For example, the hydrophobic core beneath the k1 hairpin or a portion thereof may comprise a druggable region. In another example, a druggable region may comprise domain II or a portion thereof. In still another example, a druggable region may comprise domain III or a portion thereof. In other examples, the pH-dependent hinge may serve as a druggable region. Further, a region or portion of a region of the E protein involved in trimerization, such as for example, the regions of domain II involved in trimerization, may present a druggable region. A region or a portion of a region involved in the stem fold back conformational change may comprise a druggable region, for example, such regions as the stem-domain II contact regions, the trimeric N terminal inner core, and C terminal outer layer surfaces on the clustered domains II, as well as the 53-residue stem. In certain embodiments, a druggable region may consist of the entire fragment of the E protein spanning residues 1-395.
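By way of illustration only, the residue ranges recited in this section may be collected into a simple lookup table for use in a computer modeling application. The following Python sketch is a minimal example under stated assumptions: the dictionary keys and the helper name regions_for_residue are illustrative labels chosen here, and only ranges expressly recited above are included.

```python
# Inclusive residue ranges recited above for dengue virus E protein.
# The keys and the helper name are illustrative labels (assumptions),
# not terms defined elsewhere in this specification.
DRUGGABLE_REGIONS = {
    "k1 hairpin": [(268, 280)],
    "domain I-III region": [(38, 40), (143, 147), (294, 296), (354, 365)],
    "domain I-domain III linker": [(294, 301)],
    "stem": [(396, 447)],
    "domain II": [(52, 132), (193, 280)],
}


def regions_for_residue(resnum):
    """Return the sorted names of all listed regions containing residue `resnum`."""
    return sorted(
        name
        for name, ranges in DRUGGABLE_REGIONS.items()
        if any(start <= resnum <= end for start, end in ranges)
    )
```

For example, residue 270 falls within both the k1 hairpin and domain II under the ranges above, so a single residue may map to more than one listed region.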

In yet another aspect, the present invention is directed toward methods of identifying and designing modulators which bind with, interact with, or modulate the function or activity of an active or binding site of a dengue virus E protein or other class II E protein.

C.2. Modulators, Modulator Design and Screening Using the Subject Druggable Regions

In one aspect, the present invention provides methods of screening the subject druggable regions for potential modulators, as well as methods of designing such modulators. Modulators to polypeptides of the invention and other structurally related molecules, and complexes containing the same, may be identified and developed as set forth below and otherwise using techniques and methods known to those of skill in the art. The modulators of the invention may be employed, for instance, to inhibit and treat disease caused by a flavivirus or other virus having class II E protein, such as dengue fever, dengue hemorrhagic fever, tick-borne encephalitis, West Nile virus disease, yellow fever, Kyasanur Forest disease, louping ill, hepatitis C, Ross River virus disease, and O'nyong fever.

In one aspect, the present invention is directed towards a modulator that interacts with the subject druggable regions so as to reduce the activity of the dengue virus E protein or other class II E protein. Such modulators may in certain embodiments interact with a druggable region of the invention. In still another aspect, the present invention is directed toward a modulator that is a fragment of (or homolog of such fragment or mimetic of such fragment) the druggable region of a dengue virus E protein or other viral class II E protein and competes with that druggable region. Modulators of any of the above-described druggable regions may be used alone or in complementary approaches to treat dengue viral or other viral infections.

In certain embodiments, a modulator interacts with the k1 hairpin so as to preclude it from moving, thereby modulating the activity of the dengue virus E protein or other flavivirus E protein. In another aspect, the present invention is directed towards a modulator that interacts with the stem region or the channel so as to preclude them from interacting, thereby modulating the activity of the dengue virus E protein or other flavivirus E protein. Such modulators may be, as described above, derived from either the stem region or the channel, and compete with the stem region or channel for binding. In still other embodiments, a modulator of class II E protein activity interacts with the domain I-III region. The modulator may also preclude the movement of the domain I-III region. In another aspect, the present invention is directed towards a modulator that interacts with the fusion loop so as to preclude it from moving, thereby modulating the activity of the dengue virus E protein or other E protein.

Further, the present invention is in part directed toward an inhibitor that comprises SEQ ID NO: 3 or SEQ ID NO: 4, as well as fragments, homologs, variants, orthologs, and peptidomimetics thereof. Further, the present invention is directed towards an inhibitor that interacts with the relevant surfaces on the clustered domains II, so that completion of the conformational change is inhibited, thereby inhibiting the activity of the dengue virus E protein or other E protein. The present invention is also directed towards an inhibitor that interacts with the pocket beneath the k1 hairpin to interfere with the first stage of the conformational change, thereby modulating the activity of the dengue virus E protein or other E protein. Such inhibitors may be used in complementary approaches to treat dengue viral or other viral infections.

A variety of methods for inhibiting the growth or infectivity of flaviviruses using the modulators are contemplated by the present invention. For example, exemplary methods involve contacting a flavivirus with a modulator thought or shown to be effective against such pathogen.

For example, in one aspect, the present invention contemplates a method for treating a patient suffering from dengue fever or another flavivirus infection comprising administering to the patient an amount of a modulator effective to modulate the expression and/or activity of a dengue virus E protein or other class II E protein. In certain instances, the patient is a human or a livestock animal such as a cow, pig, goat or sheep. The present invention further contemplates a method for treating a subject suffering from a flavivirus-related, alphavirus-related, or hepatitis-related disease or disorder, comprising administering to an animal having the condition a therapeutically effective amount of a molecule identified using one of the methods of the present invention.

In another embodiment, modulators of a dengue virus E protein or other class II E protein, or biological complexes containing them, may be used in the manufacture of a medicament for any number of uses, including, for example, treating any disease or other treatable condition of a patient (including humans and animals), and particularly a disease caused by a flavivirus or other virus having class II E protein, such as, for example, one of the following: dengue fever, dengue hemorrhagic fever, tick-borne encephalitis, West Nile virus disease, yellow fever, Kyasanur Forest disease, louping ill, hepatitis C, Ross River virus disease, and O'nyong fever.

(a) Modulator Design

A number of techniques can be used to screen, identify, select and design chemical entities capable of associating with a dengue virus E protein or other class II E protein, structurally homologous molecules, and other molecules. Knowledge of the structure for a dengue virus E protein or other class II E protein, determined in accordance with the methods described herein, permits the design and/or identification of molecules and/or other modulators which have a shape complementary to the conformation of a dengue virus E protein or other class II E protein, or more particularly, a druggable region thereof. It is understood that such techniques and methods may use, in addition to the exact structural coordinates and other information for a dengue virus E protein or other class II E protein, structural equivalents thereof described above (including, for example, those structural coordinates that are derived from the structural coordinates of amino acids contained in a druggable region as described above).

The term “chemical entity,” as used herein, refers to chemical compounds, complexes of two or more chemical compounds, and fragments of such compounds or complexes. In certain instances, it is desirable to use chemical entities exhibiting a wide range of structural and functional diversity, such as compounds exhibiting different shapes (e.g., flat aromatic ring(s), puckered aliphatic ring(s), straight and branched chain aliphatics with single, double, or triple bonds) and diverse functional groups (e.g., carboxylic acids, esters, ethers, amines, aldehydes, ketones, and various heterocyclic rings).

In one aspect, the method of drug design generally includes computationally evaluating the potential of a selected chemical entity to associate with any of the molecules or complexes of the present invention (or portions thereof). For example, this method may include the steps of (a) employing computational means to perform a fitting operation between the selected chemical entity and a druggable region of the molecule or complex; and (b) analyzing the results of said fitting operation to quantify the association between the chemical entity and the druggable region.

A chemical entity may be examined either through visual inspection or through the use of computer modeling using a docking program such as GRAM, DOCK, or AUTODOCK (Dunbrack et al., Folding & Design, 2:27-42 (1997)). This procedure can include computer fitting of chemical entities to a target to ascertain how well the shape and the chemical structure of each chemical entity will complement or interfere with the structure of a dengue virus E protein or other class II E protein (Bugg et al., Scientific American, Dec.: 92-98 (1993); West et al., TIPS, 16:67-74 (1995)). Computer programs may also be employed to estimate the attraction, repulsion, and steric hindrance of the chemical entity to a druggable region, for example. Generally, the tighter the fit (e.g., the lower the steric hindrance, and/or the greater the attractive force), the more potent the chemical entity will be, because these properties are consistent with a tighter binding constant. Furthermore, the more specificity in the design of a chemical entity, the more likely that the chemical entity will not interfere with related proteins, which may minimize potential side-effects due to unwanted interactions.

A variety of computational methods for molecular design, in which the steric and electronic properties of druggable regions are used to guide the design of chemical entities, are known: Cohen et al. (1990) J. Med. Chem. 33: 883-894; Kuntz et al. (1982) J. Mol. Biol. 161: 269-288; DesJarlais (1988) J. Med. Chem. 31: 722-729; Bartlett et al. (1989) Spec. Publ., Roy. Soc. Chem. 78: 182-196; Goodford et al. (1985) J. Med. Chem. 28: 849-857; and Desjarlais et al. J. Med. Chem. 29: 2149-2153. Directed methods generally fall into two categories: (1) design by analogy, in which 3-D structures of known chemical entities (such as from a crystallographic database) are docked to the druggable region and scored for goodness-of-fit; and (2) de novo design, in which the chemical entity is constructed piece-wise in the druggable region. The chemical entity may be screened as part of a library or a database of molecules. Databases which may be used include ACD (Molecular Designs Limited), NCI (National Cancer Institute), CCDC (Cambridge Crystallographic Data Center), CAST (Chemical Abstract Service), Derwent (Derwent Information Limited), Maybridge (Maybridge Chemical Company Ltd), Aldrich (Aldrich Chemical Company), DOCK (University of California in San Francisco), and the Directory of Natural Products (Chapman & Hall). Computer programs such as CONCORD (Tripos Associates) or DB-Converter (Molecular Simulations Limited) can be used to convert a data set represented in two dimensions to one represented in three dimensions.

Chemical entities may be tested for their capacity to fit spatially with a druggable region or other portion of a target protein. As used herein, the term “fits spatially” means that the three-dimensional structure of the chemical entity is accommodated geometrically by a druggable region. A favorable geometric fit occurs when the surface area of the chemical entity is in close proximity with the surface area of the druggable region without forming unfavorable interactions. A favorable complementary interaction occurs where the chemical entity interacts by hydrophobic, aromatic, ionic, dipolar, or hydrogen donating and accepting forces. Unfavorable interactions may be steric hindrance between atoms in the chemical entity and atoms in the druggable region.
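As a non-limiting illustration of the geometric test described above, the following Python sketch counts close contacts and steric clashes between the atoms of a chemical entity and the atoms of a druggable region. It is a minimal sketch under stated assumptions: the function names, the 4.5 Å contact cutoff, and the 3.0 Å clash cutoff are hypothetical values chosen for this example and are not taken from this specification, and atom types and chemical complementarity are ignored.

```python
import math


def fit_summary(entity_atoms, region_atoms, contact_cutoff=4.5, clash_cutoff=3.0):
    """Count close contacts and steric clashes between two sets of (x, y, z) points.

    Coordinates are in angstroms. The cutoffs are illustrative assumptions.
    """
    contacts = 0
    clashes = 0
    for a in entity_atoms:
        for b in region_atoms:
            d = math.dist(a, b)
            if d < clash_cutoff:
                clashes += 1      # atoms too close: unfavorable steric hindrance
            elif d <= contact_cutoff:
                contacts += 1     # atoms in close proximity without clashing
    return contacts, clashes


def fits_spatially(entity_atoms, region_atoms, min_contacts=1):
    """Treat an entity as fitting when it makes contacts and forms no clashes."""
    contacts, clashes = fit_summary(entity_atoms, region_atoms)
    return clashes == 0 and contacts >= min_contacts
```

A docking program would ordinarily replace this crude count with a scoring function that also weighs the hydrophobic, ionic, and hydrogen-bonding interactions named above.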

If a model of the present invention is a computer model, the chemical entities may be positioned in a druggable region through computational docking. If, on the other hand, the model of the present invention is a structural model, the chemical entities may be positioned in the druggable region by, for example, manual docking. As used herein the term “docking” refers to a process of placing a chemical entity in close proximity with a druggable region, or a process of finding low energy conformations of a chemical entity/druggable region complex.

In an illustrative embodiment, the design of a potential modulator begins from the general perspective of shape complementarity for the druggable region of a dengue virus E protein or other class II E protein, and a search algorithm is employed which is capable of scanning a database of small molecules of known three-dimensional structure for chemical entities which fit geometrically with the target druggable region. Most algorithms of this type provide a method for finding a wide assortment of chemical entities that are complementary to the shape of a druggable region of a dengue virus E protein or other class II E protein. Each of a set of chemical entities from a particular database, such as the Cambridge Crystallographic Data Bank (CCDB) (Allen et al. (1973) J. Chem. Doc. 13: 119), is individually docked to the druggable region of a dengue virus E protein or other class II E protein in a number of geometrically permissible orientations with use of a docking algorithm. In certain embodiments, a set of computer algorithms called DOCK can be used to characterize the shape of invaginations and grooves that form the active sites and recognition surfaces of the druggable region (Kuntz et al. (1982) J. Mol. Biol. 161: 269-288). The program can also search a database of small molecules for templates whose shapes are complementary to particular binding sites of a dengue virus E protein or other class II E protein (DesJarlais et al. (1988) J Med Chem 31: 722-729).

The orientations are evaluated for goodness-of-fit and the best are kept for further examination using molecular mechanics programs, such as AMBER or CHARMM. Such algorithms have previously proven successful in finding a variety of chemical entities that are complementary in shape to a druggable region.

Goodford (1985, J Med Chem 28:849-857) and Boobbyer et al. (1989, J Med Chem 32:1083-1094) have produced a computer program (GRID) which seeks to determine regions of high affinity for different chemical groups (termed probes) of the druggable region. GRID hence provides a tool for suggesting modifications to known chemical entities that might enhance binding. It may be anticipated that some of the sites discerned by GRID as regions of high affinity correspond to “pharmacophoric patterns” determined inferentially from a series of known ligands. As used herein, a “pharmacophoric pattern” is a geometric arrangement of features of chemical entities that is believed to be important for binding. Attempts have been made to use pharmacophoric patterns as a search screen for novel ligands (Jakes et al. (1987) J Mol Graph 5:41-48; Brint et al. (1987) J Mol Graph 5:49-56; Jakes et al. (1986) J Mol Graph 4:12-20).

Yet a further embodiment of the present invention utilizes a computer algorithm such as CLIX which searches such databases as CCDB for chemical entities which can be oriented with the druggable region in a way that is both sterically acceptable and has a high likelihood of achieving favorable chemical interactions between the chemical entity and the surrounding amino acid residues. The method is based on characterizing the region in terms of an ensemble of favorable binding positions for different chemical groups and then searching for orientations of the chemical entities that cause maximum spatial coincidence of individual candidate chemical groups with members of the ensemble. The algorithmic details of CLIX are described in Lawrence et al. (1992) Proteins 12:31-41.

In this way, the efficiency with which a chemical entity may bind to or interfere with a druggable region may be tested and optimized by computational evaluation. For example, for a favorable association with a druggable region, a chemical entity must preferably demonstrate a relatively small difference in energy between its bound and free states (i.e., a small deformation energy of binding). Thus, certain, more desirable chemical entities will be designed with a deformation energy of binding of not greater than about 10 kcal/mole, and more preferably, not greater than 7 kcal/mole. Chemical entities may interact with a druggable region in more than one conformation that is similar in overall binding energy. In those cases, the deformation energy of binding is taken to be the difference between the energy of the free entity and the average energy of the conformations observed when the chemical entity binds to the target.
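The deformation energy criterion described above may be illustrated by the following Python sketch. It is offered only as a minimal example under stated assumptions: the function names and the three category labels are hypothetical, the energies are assumed to be conformational energies of the chemical entity in kcal/mole from any consistent force field, and the sign convention (average bound-state energy minus free-state energy) is an assumption adopted here so that strain appears as a positive number.

```python
def deformation_energy(free_energy, bound_energies):
    """Average energy of the bound conformations minus the free-state energy (kcal/mole)."""
    if not bound_energies:
        raise ValueError("at least one bound conformation energy is required")
    mean_bound = sum(bound_energies) / len(bound_energies)
    return mean_bound - free_energy


def classify_deformation(deformation, acceptable=10.0, preferred=7.0):
    """Apply the thresholds of about 10 kcal/mole and 7 kcal/mole recited above."""
    if deformation <= preferred:
        return "preferred"
    if deformation <= acceptable:
        return "acceptable"
    return "rejected"
```

For instance, an entity with a free-state energy of -50.0 kcal/mole and bound conformations at -44.0 and -42.0 kcal/mole has a deformation energy of 7.0 kcal/mole under this convention.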

In this way, the present invention provides computer-assisted methods for identifying or designing a potential modulator of the activity of a dengue virus E protein or other class II E protein including: supplying a computer modeling application with a set of structure coordinates of a molecule or complex, the molecule or complex including at least a portion of a druggable region from a dengue virus E protein or other class II E protein; supplying the computer modeling application with a set of structure coordinates of a chemical entity; and determining whether the chemical entity is expected to bind to the molecule or complex, wherein binding to the molecule or complex is indicative of potential modulation of the activity of a dengue virus E protein or other class II E protein.

In another aspect, the present invention provides a computer-assisted method for identifying or designing a potential modulator to a dengue virus E protein or other class II E protein, the method including: supplying a computer modeling application with a set of structure coordinates of a molecule or complex, the molecule or complex including at least a portion of a druggable region of a dengue virus E protein or other class II E protein; supplying the computer modeling application with a set of structure coordinates for a chemical entity; evaluating the potential binding interactions between the chemical entity and active site of the molecule or molecular complex; structurally modifying the chemical entity to yield a set of structure coordinates for a modified chemical entity; and determining whether the modified chemical entity is expected to bind to the molecule or complex, wherein binding to the molecule or complex is indicative of potential modulation of the dengue virus E protein or other class II E protein.

In one embodiment, a potential modulator can be obtained by screening a peptide or other compound or chemical library (Scott and Smith, Science, 249:386-390 (1990); Cwirla et al., Proc. Natl. Acad. Sci., 87:6378-6382 (1990); Devlin et al., Science, 249:404-406 (1990)). A potential modulator selected in this manner could then be systematically modified by computer modeling programs until one or more promising potential drugs are identified. Such analysis has been shown to be effective in the development of HIV protease modulators (Lam et al., Science 263:380-384 (1994); Wlodawer et al., Ann. Rev. Biochem. 62:543-585 (1993); Appelt, Perspectives in Drug Discovery and Design 1:23-48 (1993); Erickson, Perspectives in Drug Discovery and Design 1:109-128 (1993)). Alternatively, a potential modulator may be selected from a library of chemicals such as those that can be licensed from third parties, such as chemical and pharmaceutical companies. A third alternative is to synthesize the potential modulator de novo.

For example, in certain embodiments, the present invention provides a method for making a potential modulator for a dengue virus E protein or other class II E protein, the method including synthesizing a chemical entity or a molecule containing the chemical entity to yield a potential modulator of a dengue virus E protein or other class II E protein, the chemical entity having been identified during a computer-assisted process including supplying a computer modeling application with a set of structure coordinates of a molecule or complex, the molecule or complex including at least one druggable region from a dengue virus E protein or other class II E protein; supplying the computer modeling application with a set of structure coordinates of a chemical entity; and determining whether the chemical entity is expected to bind to the molecule or complex at the active site, wherein binding to the molecule or complex is indicative of potential modulation. This method may further include the steps of evaluating the potential binding interactions between the chemical entity and the active site of the molecule or molecular complex and structurally modifying the chemical entity to yield a set of structure coordinates for a modified chemical entity, which steps may be repeated one or more times.

Once a potential modulator is identified, it can then be tested in any standard assay for the macromolecule, depending of course on the macromolecule, including in high throughput assays. Further refinements to the structure of the modulator will generally be necessary and can be made by the successive iterations of any and/or all of the steps provided by the particular screening assay, in particular further structural analysis by, e.g., ¹⁵N NMR relaxation rate determinations or x-ray crystallography with the modulator bound to a dengue virus E protein or other class II E protein. These studies may be performed in conjunction with biochemical assays.

Once identified, a potential modulator may be used as a model structure, and analogs to the compound can be obtained. The analogs are then screened for their ability to bind to a dengue virus E protein or other class II E protein. An analog of the potential modulator might be chosen as a modulator when it binds to a dengue virus E protein or other class II E protein with a higher binding affinity than the predecessor modulator.

In a related approach, iterative drug design is used to identify modulators of a target protein. Iterative drug design is a method for optimizing associations between a protein and a modulator by determining and evaluating the three dimensional structures of successive sets of protein/modulator complexes. In iterative drug design, crystals of a series of protein/modulator complexes are obtained and then the three-dimensional structure of each complex is solved. Such an approach provides insight into the association between the proteins and modulators of each complex. For example, this approach may be accomplished by selecting modulators with modulatory activity, obtaining crystals of this new protein/modulator complex, solving the three dimensional structure of the complex, and comparing the associations between the new protein/modulator complex and previously solved protein/modulator complexes. By observing how changes in the modulator affected the protein/modulator associations, these associations may be optimized.

In addition to designing and/or identifying a chemical entity to associate with a druggable region, as described above, the same techniques and methods may be used to design and/or identify chemical entities that either associate, or do not associate, with affinity regions, selectivity regions or undesired regions of protein targets. By such methods, selectivity for one or a few targets, or alternatively for multiple targets, from the same species or from multiple species, can be achieved.

For example, a chemical entity may be designed and/or identified for which the binding energy for one druggable region, e.g., an affinity region or selectivity region, is more favorable than that for another region, e.g., an undesired region, by about 20%, 30%, 50% to about 60% or more. It may be the case that the difference is observed between (a) more than two regions, (b) between different regions (selectivity, affinity or undesirable) from the same target, (c) between regions of different targets, (d) between regions of homologs from different species, or (e) between other combinations. Alternatively, the comparison may be made by reference to the Kd, usually the apparent Kd, of said chemical entity with the two or more regions in question.
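The comparison described above may be illustrated, without limitation, by the following Python sketch, which converts a dissociation constant to a standard binding free energy using the relation ΔG° = RT ln(Kd) and expresses the difference between two regions as a percentage. The function names are hypothetical, the 1 M standard state and the default temperature of 298.15 K are assumptions of this example, and reading “more favorable by about 20%” as a percentage of the binding free energy for the undesired region is one possible interpretation rather than a definition given in this specification.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL = 1.987204e-3


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy (kcal/mole) from a Kd in mol/L, assuming a 1 M standard state."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temperature_k * math.log(kd_molar)


def percent_more_favorable(dg_desired, dg_undesired):
    """Percentage by which dg_desired is more negative than dg_undesired.

    Both arguments are binding free energies in kcal/mole; dg_undesired must be negative.
    """
    if dg_undesired >= 0:
        raise ValueError("reference binding energy must be negative")
    return 100.0 * (dg_desired - dg_undesired) / dg_undesired
```

Under these assumptions, a binding free energy of -12 kcal/mole for a selectivity region is 20% more favorable than -10 kcal/mole for an undesired region.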

In another aspect, prospective modulators are screened for binding to two nearby druggable regions on a target protein. For example, a modulator that binds a first region of a target polypeptide does not bind a second nearby region. Binding to the second region can be determined by monitoring changes in a different set of amide chemical shifts in either the original screen or a second screen conducted in the presence of a modulator (or potential modulator) for the first region. From an analysis of the chemical shift changes, the approximate location of a potential modulator for the second region is identified. Optimization of the second modulator for binding to the region is then carried out by screening structurally related compounds (e.g., analogs as described above). When modulators for the first region and the second region are identified, their location and orientation in the ternary complex can be determined experimentally. On the basis of this structural information, a linked compound, e.g., a consolidated modulator, is synthesized in which the modulator for the first region and the modulator for the second region are linked. In certain embodiments, the two modulators are covalently linked to form a consolidated modulator. This consolidated modulator may be tested to determine if it has a higher binding affinity for the target than either of the two individual modulators. A consolidated modulator is selected as a modulator when it has a higher binding affinity for the target than either of the two modulators. Larger consolidated modulators can be constructed in an analogous manner, e.g., linking three modulators which bind to three nearby regions on the target to form a multilinked consolidated modulator that has an even higher affinity for the target than the linked modulator. In this example, it is assumed that it is desirable to have the modulator bind to all the druggable regions. However, it may be the case that binding to certain of the druggable regions is not desirable, so that the same techniques may be used to identify modulators and consolidated modulators that show increased specificity based on binding to at least one but not all druggable regions of a target.

The present invention provides a number of methods that use drug design as described above. For example, in one aspect, the present invention contemplates a method for designing a candidate compound for screening for modulators of a dengue virus E protein or other class II E protein, the method comprising: (a) determining the three dimensional structure of a crystallized dengue virus E protein or other class II E protein or a fragment thereof; and (b) designing a candidate modulator based on the three dimensional structure of the crystallized polypeptide or fragment.

In another aspect, the present invention contemplates a method for identifying a potential modulator of a dengue virus E protein or other class II E protein, the method comprising: (a) providing the three-dimensional coordinates of a dengue virus E protein or other class II E protein or a fragment thereof; (b) identifying a druggable region of the polypeptide or fragment; and (c) selecting from a database at least one compound that comprises three dimensional coordinates which indicate that the compound may bind the druggable region; (d) wherein the selected compound is a potential modulator of a dengue virus E protein or other class II E protein.

In another aspect, the present invention contemplates a method for identifying a potential modulator of a molecule comprising a druggable region similar to that of an E protein k1 hairpin, the method comprising: (a) using the atomic coordinates of amino acid residues from a druggable region, such as an E protein k1 hairpin, or a fragment thereof, ± a root mean square deviation from the backbone atoms of the amino acids of not more than 1.5 Å, to generate a three-dimensional structure of a molecule comprising an E protein k1 hairpin-like druggable region; (b) employing the three dimensional structure to design or select the potential modulator; (c) synthesizing the modulator; and (d) contacting the modulator with the molecule to determine the ability of the modulator to interact with the molecule.
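
The 1.5 Å root mean square deviation criterion recited above can be checked numerically. The following is a minimal sketch, assuming that the two sets of backbone atoms are listed in the same order and have already been superimposed (for example, by a least-squares fitting step that is not shown); the function names `backbone_rmsd` and `within_tolerance` are hypothetical and used only for illustration.

```python
import math

def backbone_rmsd(coords_a, coords_b):
    """Root mean square deviation (in Å) between two equal-length lists of
    (x, y, z) backbone atom positions. Assumes the structures are already
    superimposed and the atoms are listed in matching order."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))

def within_tolerance(coords_a, coords_b, cutoff=1.5):
    """True when the backbone RMSD does not exceed the cutoff (default 1.5 Å)."""
    return backbone_rmsd(coords_a, coords_b) <= cutoff

# Hypothetical coordinates: the model is the reference shifted by 1 Å along z.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.5, 0.0)]
model = [(x, y, z + 1.0) for (x, y, z) in reference]
print(backbone_rmsd(reference, model), within_tolerance(reference, model))
```

Without a prior superposition the value computed this way also reflects rigid-body displacement, so it would overstate the structural difference between the two regions.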

In another aspect, the present invention contemplates an apparatus for determining whether a compound is a potential modulator of a dengue virus E protein or other class II E protein, the apparatus comprising: (a) a memory that comprises: (i) the three dimensional coordinates and identities of the atoms of a dengue virus E protein or other class II E protein or a fragment thereof that form a druggable site, such as for example, an E protein k1 hairpin; and (ii) executable instructions; and (b) a processor that is capable of executing instructions to: (i) receive three-dimensional structural information for a candidate compound; (ii) determine if the three-dimensional structure of the candidate compound is complementary to the structure of the interior of the druggable site; and (iii) output the results of the determination.

In another aspect, the present invention contemplates a method for designing a potential compound for the prevention or treatment of a flavivirus related disease or disorder, the method comprising: (a) providing the three dimensional structure of a crystallized dengue virus E protein or other class II E protein, or a fragment thereof; (b) synthesizing a potential compound for the prevention or treatment of flavivirus related disease or disorder based on the three dimensional structure of the crystallized polypeptide or fragment; (c) contacting a dengue virus E protein or other class II E protein with the potential compound; and (d) assaying the activity of a dengue virus E protein or other class II E protein, wherein a change in the activity of the polypeptide indicates that the compound may be useful for prevention or treatment of a flavivirus related disease or disorder.

In another aspect, the present invention contemplates a method for designing a potential compound for the prevention or treatment of flavivirus related disease or disorder, the method comprising: (a) providing structural information of a druggable region derived from NMR spectroscopy of a dengue virus E protein or other class II E protein, or a fragment thereof; (b) synthesizing a potential compound for the prevention or treatment of flavivirus related disease or disorder based on the structural information; (c) contacting a dengue virus E protein or other class II E protein or a flavivirus with the potential compound; and (d) assaying the activity of a dengue virus E protein or other class II E protein, wherein a change in the activity of the polypeptide indicates that the compound may be useful for prevention or treatment of a flavivirus related disease or disorder.

-   -   (b) Modulator Libraries

The synthesis and screening of combinatorial libraries is a validated strategy for the identification and study of organic molecules of interest. According to the present invention, the synthesis of libraries containing molecules that bind, interact with, or modulate the activity/function of a subject druggable region may be performed using established combinatorial methods for solution phase, solid phase, or a combination of solution phase and solid phase synthesis techniques. The synthesis of combinatorial libraries is well known in the art and has been reviewed (see, e.g., “Combinatorial Chemistry”, Chemical and Engineering News, Feb. 24, 1997, p. 43; Thompson et al., Chem. Rev. (1996) 96:555). Many libraries are commercially available. One of ordinary skill in the art will realize that the choice of method for any particular embodiment will depend upon the specific number of molecules to be synthesized, the specific reaction chemistry, and the availability of specific instrumentation, such as robotic instrumentation for the preparation and analysis of the inventive libraries. In certain embodiments, the reactions to be performed to generate the libraries are selected for their ability to proceed in high yield, and in a stereoselective and regioselective fashion, if applicable.

In one aspect of the present invention, the inventive libraries are generated using a solution phase technique. Traditional advantages of solution phase techniques for the synthesis of combinatorial libraries include the availability of a much wider range of reactions, and the relative ease with which products may be characterized, and ready identification of library members, as discussed below. For example, in certain embodiments, for the generation of a solution phase combinatorial library, a parallel synthesis technique is utilized, in which all of the products are assembled separately in their own reaction vessels. In a particular parallel synthesis procedure, a microtitre plate containing n rows and m columns of tiny wells which are capable of holding a few milliliters of the solvent in which the reaction will occur, is utilized. It is possible to then use n variants of reactant A, such as a ligand, and m variants of reactant B, such as a second ligand, to obtain n×m variants, in n×m wells. One of ordinary skill in the art will realize that this particular procedure is most useful when smaller libraries are desired, and the specific wells may provide a ready means to identify the library members in a particular well.
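
The n×m parallel synthesis layout described above can be sketched as a simple mapping from well position to reactant pair. The example below is illustrative only; the function name `plate_layout`, the row-letter/column-number well labels, and the reactant names are assumptions made for the sketch rather than part of any particular procedure.

```python
import string
from itertools import product

def plate_layout(reactants_a, reactants_b):
    """Map each well label (row letter + column number) to the (A, B) pair
    reacted in that well: one row per variant of reactant A, one column per
    variant of reactant B."""
    if len(reactants_a) > len(string.ascii_uppercase):
        raise ValueError("this sketch labels at most 26 rows")
    layout = {}
    for (i, a), (j, b) in product(enumerate(reactants_a), enumerate(reactants_b)):
        well = f"{string.ascii_uppercase[i]}{j + 1}"
        layout[well] = (a, b)
    return layout

# Hypothetical reactant names: n = 2 variants of A, m = 3 variants of B.
layout = plate_layout(["a1", "a2"], ["b1", "b2", "b3"])
for well, pair in sorted(layout.items()):
    print(well, pair)
```

Because each well holds exactly one combination, the well label alone identifies the library member, which is the spatial identification referred to above.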

In other embodiments of the present invention, a solid phase synthesis technique is utilized. Solid phase techniques allow reactions to be driven to completion because excess reagents may be utilized and the unreacted reagent washed away. In addition to the parallel synthesis technique, solid phase synthesis also allows the use of a technique called “split and pool”, developed by Furka. See, e.g., Furka et al., Abstr. 14th Int. Congr. Biochem., (Prague, Czechoslovakia) (1988) 5:47; Furka et al., Int. J. Pept. Protein Res. (1991) 37:487; Sebestyen et al., Bioorg. Med. Chem. Lett. (1993) 3:413. In this technique, a mixture of related molecules may be made in the same reaction vessel, thus substantially reducing the number of containers required for the synthesis of very large libraries, such as those containing as many as or more than one million library members. As an example, the solid support with the starting material attached may be divided into n vessels, where n represents the number of species of reagent A to be reacted with such starting material. After reaction, the contents from the n vessels are combined and then split into m vessels, where m represents the number of species of reagent B to be reacted with the now modified starting materials. This procedure is repeated until the desired number of reagents is reacted with the starting materials to yield the inventive library.
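
The saving in reaction vessels afforded by split and pool synthesis can be illustrated with a short simulation. This is an idealized sketch: it assumes that every pooled intermediate is represented in every vessel after each split, whereas in practice the solid support is divided statistically. The function name `split_and_pool` and the reagent names are hypothetical.

```python
def split_and_pool(reagent_sets):
    """Idealized split-and-pool: returns (library_members, vessels_used).
    Each round splits the pooled support into one vessel per reagent,
    reacts it, and pools the products again."""
    library = [()]          # the support-bound starting material
    vessels = 0
    for reagents in reagent_sets:
        # Every pooled intermediate is extended by every reagent of this round.
        library = [member + (r,) for r in reagents for member in library]
        vessels += len(reagents)
    return library, vessels

# Hypothetical example: n = 3 species of reagent A, then m = 2 species of reagent B.
library, vessels = split_and_pool([["a1", "a2", "a3"], ["b1", "b2"]])
print(len(library), "members made in", vessels, "vessels")
```

The number of members grows as the product of the reagent counts while the number of vessels grows only as their sum; under the same idealization, three rounds of 100 reagents each would correspond to one million members prepared in 300 vessels.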

The use of solid phase techniques in the present invention may also include the use of a specific encoding technique. Specific encoding techniques have been reviewed by Czarnik in Current Opinion in Chemical Biology (1997) 1:60. One of ordinary skill in the art will also realize that if smaller solid phase libraries are generated in specific reaction wells, such as 96 well plates, or on plastic pins, the reaction history of these library members may also be identified by their spatial coordinates in the particular plate, and thus are spatially encoded. In other embodiments, an encoding technique involves the use of a particular “identifying agent” attached to the solid support, which enables the determination of the structure of a specific library member without reference to its spatial coordinates. Examples of such encoding techniques include, but are not limited to, spatial encoding techniques, graphical encoding techniques, including the “tea bag” method, chemical encoding methods, and spectrophotometric encoding methods. One of ordinary skill in the art will realize that the particular encoding method to be used in the present invention must be selected based upon the number of library members desired, and the reaction chemistry employed.

In certain embodiments, molecules of the present invention may be prepared using solid support chemistry known in the art. For example, polypeptides having up to twenty amino acids or more may be generated using standard solid phase technology on commercially available equipment (such as Advanced Chemtech multiple organic synthesizers). In certain embodiments, a starting material or later reactant may be attached to the solid phase, through a linking unit, or directly, and subsequently used in the synthesis of desired molecules. The choice of linkage will depend upon the reactivity of the molecules and the solid support units and the stability of these linkages. Direct attachment to the solid support via a linker molecule may be useful if it is desired not to detach the library member from the solid support. For example, for direct on-bead analysis of biological activity, a stronger interaction between the library member and the solid support may be desirable. Alternatively, the use of a linking reagent may be useful if more facile cleavage of the inventive library members from the solid support is desired.

In regard to automation of the present subject methods, a variety of instrumentation may be used to allow for the facile and efficient preparation of chemical libraries of the present invention, and methods of assaying members of such libraries. In general, automation, as used in reference to the synthesis and preparation of the subject chemical libraries, involves having instrumentation complete one or more of the operative steps that must be repeated a multitude of times because a library instead of a single molecule is being prepared. Examples of automation include, without limitation, having instrumentation complete the addition of reagents, the mixing and reaction of them, filtering of reaction mixtures, washing of solids with solvents, removal and addition of solvents, and the like. Automation may be applied to any steps in a reaction scheme, including those to prepare, purify and assay molecules for use in the compositions of the present invention.

There is a range of automation possible. For example, the synthesis of the subject libraries may be wholly automated or only partially automated. If wholly automated, the subject library may be prepared by the instrumentation without any human intervention after initiating the synthetic process, other than refilling reagent bottles or monitoring or programming the instrumentation as necessary. Although synthesis of a subject library may be wholly automated, it may be necessary for there to be human intervention for purification, identification, or the like of the library members.

In contrast, partial automation of the synthesis of a subject library involves some robotic assistance with the physical steps of the reaction schema that gives rise to the library, such as mixing, stirring, filtering and the like, but still requires some human intervention other than just refilling reagent bottles or monitoring or programming the instrumentation. This type of robotic automation is distinguished from assistance provided by conventional organic synthetic and biological techniques because in partial automation, instrumentation still completes one or more of the steps of any schema that is required to be completed a multitude of times because a library of molecules is being prepared.

In certain embodiments, the subject library may be prepared in multiple reaction vessels (e.g., microtitre plates and the like), and the identity of particular members of the library may be determined by the location of each vessel. In other embodiments, the subject library may be synthesized in solution, and by the use of deconvolution techniques, the identity of particular members may be determined.

In one aspect of the invention, the subject screening method may be carried out utilizing immobilized libraries. In certain embodiments, the immobilized library will have the ability to bind to a microorganism as described above. The choice of a suitable support will be routine to the skilled artisan. Important criteria may include that the reactivity of the support not interfere with the reactions required to prepare the library. Insoluble polymeric supports include functionalized polymers based on polystyrene, polystyrene/divinylbenzene copolymers, and the like, including any of the particles described in section 4.3. It will be understood that the polymeric support may be coated, grafted or otherwise bonded to other solid supports.

In another embodiment, the polymeric support may be provided by reversibly soluble polymers. Such polymeric supports include functionalized polymers based on polyvinyl alcohol or polyethylene glycol (PEG). A soluble support may be made insoluble (e.g., may be made to precipitate) by addition of a suitable inert nonsolvent. One advantage of reactions performed using soluble polymeric supports is that reactions in solution may be more rapid, higher yielding, and more complete than reactions that are performed on insoluble polymeric supports.

Once the synthesis of either a desired solution phase or solid support bound template has been completed, the template is then available for further reaction to yield the desired solution phase or solid support bound structure. The use of solid support bound templates enables the use of more rapid split and pool techniques.

Characterization of the library members may be performed using standard analytical techniques, such as mass spectrometry, Nuclear Magnetic Resonance Spectroscopy, including 195Pt and 1H NMR, chromatography (e.g., liquid chromatography) and infra-red spectroscopy. One of ordinary skill in the art will realize that the selection of a particular analytical technique will depend upon whether the inventive library members are in the solution phase or on the solid phase. In addition to such characterization, the library member may be synthesized separately to allow for more ready identification.

-   -   (c) In Vitro Assays

Any form of dengue virus E protein or other class II E protein, e.g. a full-length polypeptide or a fragment comprising the target druggable region, may be used to assess the activity of candidate small molecules and other modulators in in vitro assays. In one embodiment of such an assay, agents are identified which modulate the biological activity of a druggable region, the protein-protein interaction of interest or formation of a protein complex involving a subject druggable region. In another embodiment of such an assay, agents are identified which bind or interact with a subject druggable region. In certain embodiments, the test agent is a small organic molecule. The candidate agents may be selected, for example, from the following classes of compounds: detergents, proteins, peptides, peptidomimetics, small molecules, cytokines, or hormones. In some embodiments, the candidate therapeutics may be in a library of compounds. These libraries may be generated using combinatorial synthetic methods as described above. In certain embodiments of the present invention, the ability of said candidate therapeutics to bind a target gene or gene product may be evaluated by an in vitro assay. In other embodiments, discussed in the next section, the binding assay may also be in vivo.

The invention also provides a method of screening multiple compounds to identify those which modulate the action of polypeptides of the invention, or polynucleotides encoding the same. The method of screening may involve high-throughput techniques. For example, to screen for modulators, a synthetic reaction mix, a cellular compartment, such as a membrane, cell envelope or cell wall, or a preparation of any thereof, a whole cell or tissue, or even a whole organism comprising a dengue virus E protein or other class II E protein and a labeled substrate or ligand of such polypeptide is incubated in the absence or the presence of a candidate molecule that may be a modulator of a dengue virus E protein or other class II E protein. The ability of the candidate molecule to modulate a dengue virus E protein or other class II E protein is reflected in decreased binding of the labeled ligand or decreased production of product from such substrate. Detection of the rate or level of production of product from substrate may be enhanced by using a reporter system. Reporter systems that may be useful in this regard include but are not limited to a colorimetric labeled substrate converted into product, a reporter gene that is responsive to changes in a nucleic acid of the invention or polypeptide activity, and binding assays known in the art.

Another example of an assay for a modulator of a dengue virus E protein or other class II E protein is a competitive assay that combines a dengue virus E protein or other class II E protein and a potential modulator with molecules that bind to a dengue virus E protein or other class II E protein, recombinant molecules that bind to a dengue virus E protein or other class II E protein, natural substrates or ligands, or substrate or ligand mimetics, under appropriate conditions for a competitive inhibition assay. Polypeptides of the invention can be labeled, such as by radioactivity or a colorimetric compound, such that the number of molecules of a dengue virus E protein or other class II E protein bound to a binding molecule or converted to product can be determined accurately to assess the effectiveness of the potential modulator.

A number of methods for identifying a molecule which modulates the activity of a polypeptide are known in the art. For example, in one such method, a dengue virus E protein or other class II E protein is contacted with a test compound, and the activity of the dengue virus E protein or other class II E protein in the presence of the test compound is determined, wherein a change in the activity of the dengue virus E protein or other class II E protein is indicative that the test compound modulates the activity of the dengue virus E protein or other class II E protein. In certain instances, the test compound agonizes the activity of the dengue virus E protein or other class II E protein, and in other instances, the test compound antagonizes the activity of the dengue virus E protein or other class II E protein.

In another example, a compound which modulates dengue virus E protein or other class II E protein dependent growth or infectivity of flavivirus may be identified by (a) contacting a dengue virus E protein or other class II E protein with a test compound; and (b) determining the activity of the polypeptide in the presence of the test compound, wherein a change in the activity of the polypeptide is indicative that the test compound may modulate the growth or infectivity of flavivirus.

In certain of the subject assays, to evaluate the results using the subject compositions, comparisons may be made to known molecules, such as one with a known binding affinity for the target. For example, a known molecule and a new molecule of interest may be assayed. The result of the assay for the subject complex will be of a type and of a magnitude that may be compared to the result for the known molecule. To the extent that the subject complex exhibits a type of response in the assay that is quantifiably different from that of the known molecule, then the result for such complex in the assay would be deemed a positive or negative result. In certain assays, the magnitude of the response may be expressed as a percentage of the response obtained with the known molecule, e.g. 100% of the known result if they are the same.
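
Expressing an assay result as a percentage of the result for a known molecule amounts to a simple normalization, sketched below. The optional background subtraction and the function name `percent_of_known` are assumptions added for illustration; the signal values are hypothetical.

```python
def percent_of_known(test_signal, known_signal, background=0.0):
    """Response of a test molecule as a percentage of the response of a known
    molecule, after subtracting an optional background signal from both."""
    net_known = known_signal - background
    if net_known == 0:
        raise ValueError("known-molecule signal equals background; cannot normalize")
    return 100.0 * (test_signal - background) / net_known

# Hypothetical readings in arbitrary units.
print(percent_of_known(100.0, 100.0))        # identical results
print(percent_of_known(60.0, 110.0, 10.0))   # background-corrected comparison
```

Whether a given percentage is deemed a positive or negative result depends on the threshold chosen for the particular assay, which this sketch does not set.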

As those skilled in the art will understand, based on the present description, binding assays may be used to detect agents that bind a polypeptide. Cell-free assays may be used to identify molecules that are capable of interacting with a polypeptide. In a preferred embodiment, cell-free assays for identifying such molecules are comprised essentially of a reaction mixture containing a target and a test molecule or a library of test molecules. A test molecule may be, e.g., a derivative of a known binding partner of the target, e.g., a biologically inactive peptide, or a small molecule. Agents to be tested for their ability to bind may be produced, for example, by bacteria, yeast or other organisms (e.g. natural products), produced chemically (e.g. small molecules, including peptidomimetics), or produced recombinantly. In certain embodiments, the test molecule is selected from the group consisting of lipids, carbohydrates, peptides, peptidomimetics, peptide-nucleic acids (PNAs), proteins, small molecules, natural products, aptamers and oligonucleotides. In other embodiments of the invention, the binding assays are not cell-free. In a preferred embodiment, such assays for identifying molecules that bind a target comprise a reaction mixture containing a target microorganism and a test molecule or a library of test molecules.

In many candidate screening programs which test libraries of molecules and natural extracts, high throughput assays are desirable in order to maximize the number of molecules surveyed in a given period of time. Assays of the present invention which are performed in cell-free systems, such as may be derived with purified or semi-purified proteins or with lysates, are often preferred as “primary” screens in that they may be generated to permit rapid development and relatively easy detection of binding between a target and a test molecule. Moreover, the effects of cellular toxicity and/or bioavailability of the test molecule may be generally ignored in the in vitro system, the assay instead being focused primarily on the ability of the molecule to bind the target. Accordingly, potential binding molecules may be detected in a cell-free assay generated by constitution of functional interactions of interest in a cell lysate. In an alternate format, the assay may be derived as a reconstituted protein mixture which, as described below, offers a number of benefits over lysate-based assays.

In one aspect, the present invention provides assays that may be used to screen for molecules that bind E protein druggable regions. In an exemplary binding assay, the molecule of interest is contacted with a mixture generated from target cell surface polypeptides. Detection and quantification of expected binding to a target polypeptide provides a means for determining the molecule's efficacy at binding the target. The efficacy of the molecule may be assessed by generating dose response curves from data obtained using various concentrations of the test molecule. Moreover, a control assay may also be performed to provide a baseline for comparison. In the control assay, the formation of complexes is quantitated in the absence of the test molecule.
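
One simple way to summarize a dose response curve against the control baseline described above is to normalize each signal to the control and interpolate the concentration giving a 50% response. The sketch below assumes a test molecule that reduces complex formation, concentrations listed in ascending order, and linear interpolation on a log10 concentration scale; the function names and the data are hypothetical, and a fitted sigmoidal model would ordinarily be preferred for real data.

```python
import math

def percent_of_control(signals, control_signal):
    """Express each complex-formation signal as a percentage of the control
    signal measured in the absence of the test molecule."""
    return [100.0 * s / control_signal for s in signals]

def interpolate_half_max(concentrations, responses, target=50.0):
    """Concentration at which the response crosses `target` percent, found by
    linear interpolation on a log10 concentration axis. Returns None when
    the curve never crosses the target."""
    points = list(zip(concentrations, responses))
    for (c_lo, r_lo), (c_hi, r_hi) in zip(points, points[1:]):
        if (r_lo - target) * (r_hi - target) <= 0 and r_lo != r_hi:
            frac = (r_lo - target) / (r_lo - r_hi)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None

# Hypothetical data: concentrations in micromolar, signals in arbitrary units.
concentrations = [0.1, 1.0, 10.0, 100.0]
signals = [950.0, 750.0, 250.0, 50.0]
responses = percent_of_control(signals, control_signal=1000.0)
print(responses, interpolate_half_max(concentrations, responses))
```

The interpolated value depends on how densely the concentrations bracket the midpoint, so it is best read as a rough estimate for ranking test molecules.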

Complex formation between a molecule and a target E protein or microorganism containing a class II E protein may be detected by a variety of techniques, many of which are effectively described above. For instance, modulation in the formation of complexes may be quantitated using, for example, detectably labeled proteins (e.g. radiolabeled, fluorescently labeled, or enzymatically labeled), by immunoassay, or by chromatographic detection.

Accordingly, one exemplary screening assay of the present invention includes the steps of contacting a class II E protein or functional fragment thereof with a test molecule or library of test molecules and detecting the formation of complexes. For detection purposes, for example, the class II E protein or fragment may be labeled with a specific marker and the test molecule or library of test molecules labeled with a different marker. Interaction of a test molecule with a polypeptide or fragment thereof may then be detected by determining the level of the two labels after an incubation step and a washing step. The presence of two labels after the washing step is indicative of an interaction. Such an assay may also be modified to work with a whole target cell.

An interaction between a class II E protein target and a molecule may also be identified by using real-time BIA (Biomolecular Interaction Analysis, Pharmacia Biosensor AB) which detects surface plasmon resonance (SPR), an optical phenomenon. Detection depends on changes in the mass concentration of macromolecules at the biospecific interface, and does not require any labeling of interactants. In one embodiment, a library of test molecules may be immobilized on a sensor surface, e.g., which forms one wall of a micro-flow cell. A solution containing the target is then flowed continuously over the sensor surface. A change in the resonance angle, as shown on a signal recording, indicates that an interaction has occurred. This technique is further described, e.g., in BIAtechnology Handbook by Pharmacia.

In a preferred embodiment, it will be desirable to immobilize the target to facilitate separation of complexes from uncomplexed forms, as well as to accommodate automation of the assay. Binding of polypeptide to a test molecule may be accomplished in any vessel suitable for containing the reactants. Examples include microtitre plates, test tubes, and micro-centrifuge tubes. In one embodiment, a fusion protein may be provided which adds a domain that allows the target to be bound to a matrix. For example, glutathione-S-transferase/polypeptide (GST/polypeptide) fusion proteins may be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, Mo.) or glutathione derivatized microtitre plates, which are then combined with a labeled test molecule (e.g., S³⁵ labeled, P³³ labeled, and the like), and the mixture incubated under conditions conducive to complex formation, e.g. at physiological conditions for salt and pH, though slightly more stringent conditions may be desired. Following incubation, the beads are washed to remove any unbound label, and the matrix immobilized and radiolabel determined directly (e.g. beads placed in scintillant), or in the supernatant after the complexes are subsequently dissociated. Alternatively, the complexes may be dissociated from the matrix, separated by SDS-PAGE, and the level of polypeptide or binding partner found in the bead fraction quantitated from the gel using standard electrophoretic techniques such as described in the appended examples. The above techniques could also be modified such that the test molecule is immobilized, and the labeled target is incubated with the immobilized test molecules. In one embodiment of the invention, the test molecules are immobilized, optionally via a linker, to a particle of the invention, e.g. to create the ultimate composition.

Other techniques for immobilizing targets or molecules on matrices may be used in the subject assays. For instance, a target or molecule may be immobilized utilizing conjugation of biotin and streptavidin. For instance, biotinylated polypeptide molecules may be prepared from biotin-NHS (N-hydroxy-succinimide) using techniques well known in the art (e.g., biotinylation kit, Pierce Chemicals, Rockford, Ill.), and immobilized in the wells of streptavidin-coated 96 well plates (Pierce Chemical). Alternatively, antibodies reactive with a target or molecule may be derivatized to the wells of the plate, and the target or molecule trapped in the wells by antibody conjugation. As above, preparations of test molecules are incubated in the polypeptide presenting wells of the plate, and the amount of complex trapped in the well may be quantitated. Exemplary methods for detecting such complexes, in addition to those described above for the GST-immobilized complexes, include immunodetection of complexes using antibodies reactive with the complex, or which are reactive with one of the complex components; as well as enzyme-linked assays which rely on detecting an enzymatic activity associated with a target or molecule, either intrinsic or extrinsic activity. In an instance of the latter, the enzyme may be chemically conjugated or provided as a fusion protein with the target or molecule. To illustrate, a target polypeptide may be chemically cross-linked or genetically fused with horseradish peroxidase, and the amount of polypeptide trapped in a complex with a molecule may be assessed with a chromogenic substrate of the enzyme, e.g. 3,3′-diaminobenzidine tetrahydrochloride or 4-chloro-1-naphthol. Likewise, a fusion protein comprising the polypeptide and glutathione-S-transferase may be provided, and complex formation quantitated by detecting the GST activity using 1-chloro-2,4-dinitrobenzene (Habig et al (1974) J Biol Chem 249:7130).

For processes that rely on immunodetection for quantitating one of the components trapped in a complex, antibodies against a component, such as anti-polypeptide antibodies, may be used. Alternatively, the component to be detected in the complex may be “epitope tagged” in the form of a fusion protein which includes, in addition to the polypeptide sequence, a second polypeptide for which antibodies are readily available (e.g. from commercial sources). For instance, the GST fusion proteins described above may also be used for quantification of binding using antibodies against the GST moiety. Other useful epitope tags include myc-epitopes (e.g., see Ellison et al. (1991) J Biol Chem 266:21150-21157), which include a 10-residue sequence from c-myc, as well as the pFLAG system (International Biotechnologies, Inc.) or the pEZZ-protein A system (Pharmacia, N.J.).

In certain in vitro embodiments of the present assay, the solution containing the target comprises a reconstituted protein mixture of at least semi-purified proteins. By semi-purified, it is meant that the components utilized in the reconstituted mixture have been previously separated from other cellular or viral proteins. For instance, in contrast to cell lysates, a target protein is present in the mixture to at least 50% purity relative to all other proteins in the mixture, and more preferably is present at 90-95% purity. In certain embodiments of the subject method, the reconstituted protein mixture is derived by mixing highly purified proteins such that the reconstituted mixture substantially lacks other proteins (such as of cellular or viral origin) which might interfere with or otherwise alter the ability to measure binding activity. In one embodiment, the use of reconstituted protein mixtures allows more careful control of the target:molecule interaction conditions.

In still other embodiments of the present invention, variations of viral fusion or viral infectivity assays may be utilized in order to determine the ability of a test molecule to prevent a virus expressing a class II E protein from binding to, fusing with, or infecting cells. If fusion, binding, or infection is prevented, then the molecule or composition may be useful as a therapeutic agent.

All of the screening methods may be accomplished by using a variety of assay formats. In light of the present disclosure, those not expressly described herein will nevertheless be known and comprehended by one of ordinary skill in the art. Assay formats which approximate such conditions as formation of protein complexes or protein-nucleic acid complexes, and enzymatic activity may be generated in many different forms, as those skilled in the art will appreciate based on the present description and include but are not limited to assays based on cell-free systems, e.g. purified proteins or cell lysates, as well as cell-based assays which utilize intact cells. Assaying binding resulting from a given target:molecule interaction may be accomplished in any vessel suitable for containing the reactants. Examples include microtitre plates, test tubes, and micro-centrifuge tubes. Any of the assays may be provided in kit format and may be automated. Many of the following particularized assays rely on general principles, such as blockage or prevention of fusion, that may apply to other particular assays.

(d) In Vivo Assays

Animal models of viral infection and/or disease may be used as an in vivo assay for evaluating the effectiveness of a potential drug target in treating or preventing flavivirus related diseases or disorders. A number of suitable animal models are described briefly below; however, these models are only examples, and modifications, or completely different animal models, may be used in accord with the methods of the invention. Animal models may be developed by methods known in the art, for example, by infecting an animal with dengue virus or another flavivirus, or by genetically engineering an animal to be predisposed to such infection (see, e.g., Wu, S.-J. L. et al. Evaluation of the severe combined immunodeficient (SCID) mouse as an animal model for dengue viral infection. Am. J. Trop. Med. Hyg. 52, 468-476 (1995)).

Further, viral infectivity assays may be used as in vivo assays to assess the effectiveness of a potential drug target in treating or preventing flavivirus related diseases or disorders. For example, the plaque assays described in Diamond et al (2000) J Virol 74:4957-4966 may be used to assess, by analyzing virion production, whether an agent may modulate infectivity of dengue virus. Other assays, such as competitive, asymmetric reverse transcriptase-mediated PCR (RT-PCR) assays and flow cytometric assays that measure viral antigen, also described in Diamond et al, may be used to assess the effectiveness of a potential drug target.

Still further, cell-cell fusion assays may be used as in vivo assays to assess the effectiveness of a potential drug target in treating or preventing flavivirus related diseases or disorders. For example, a cell-cell fusion assay in which the cell membrane fusion activity of dengue virus may be analyzed is described in Despres et al (1993) Virology 196:209-219.

A variety of other in vivo models are available and may be used when appropriate for specific pathogens or specific test agents.

It is also relevant to note that the species of animal used for an infection model, and the specific genetic make-up of that animal, may contribute to the effective evaluation of the effects of a particular test agent. For example, immuno-incompetent animals may, in some instances, be preferable to immuno-competent animals, because the action of a competent immune system may, to some degree, mask the effects of the test agent as compared to a similar infection in an immuno-incompetent animal. In addition, many opportunistic infections, in fact, occur in immuno-compromised patients, so modeling an infection in a similar immunological environment is appropriate.

E. Pharmaceutical Compositions

Pharmaceutical compositions of this invention include any modulator identified according to the present invention, or a pharmaceutically acceptable salt thereof, and a pharmaceutically acceptable carrier, adjuvant, or vehicle. The term “pharmaceutically acceptable carrier” refers to a carrier(s) that is “acceptable” in the sense of being compatible with the other ingredients of a composition and not deleterious to the recipient thereof.

Methods of making and using such pharmaceutical compositions are also included in the invention. The pharmaceutical compositions of the invention can be administered orally, parenterally, by inhalation spray, topically, rectally, nasally, buccally, vaginally, or via an implanted reservoir. The term parenteral as used herein includes subcutaneous, intracutaneous, intravenous, intramuscular, intra-articular, intrasynovial, intrasternal, intrathecal, intralesional, and intracranial injection or infusion techniques.

Dosage levels of between about 0.01 and about 100 mg/kg body weight per day, preferably between about 0.5 and about 75 mg/kg body weight per day of the modulators described herein are useful for the prevention and treatment of disease and conditions, including diseases and conditions mediated by pathogenic species of origin for the polypeptides of the invention. The amount of active ingredient that may be combined with the carrier materials to produce a single dosage form will vary depending upon the host treated and the particular mode of administration. A typical preparation will contain from about 5% to about 95% active compound (w/w). Alternatively, such preparations contain from about 20% to about 80% active compound.
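The arithmetic implied by these ranges can be illustrated with a minimal sketch. The function names, the 70 kg body weight and the chosen dose and w/w fraction are hypothetical values selected only for illustration; they are not taken from the specification beyond the stated ranges.

```python
def daily_amount_mg(body_weight_kg: float, dose_mg_per_kg: float) -> float:
    """Total daily amount of modulator (mg) for a body weight and a mg/kg/day dose."""
    # Range check against the "about 0.01 to about 100 mg/kg per day" window.
    if not 0.01 <= dose_mg_per_kg <= 100.0:
        raise ValueError("dose is outside the about 0.01-100 mg/kg/day range")
    return body_weight_kg * dose_mg_per_kg


def preparation_mass_mg(active_mg: float, active_fraction_ww: float) -> float:
    """Mass of preparation (mg) that carries active_mg at a given w/w fraction."""
    # Range check against the "about 5% to about 95% active compound (w/w)" window.
    if not 0.05 <= active_fraction_ww <= 0.95:
        raise ValueError("active fraction is outside the about 5%-95% (w/w) range")
    return active_mg / active_fraction_ww


# Hypothetical example: 70 kg body weight, 0.5 mg/kg/day, 20% (w/w) preparation.
amount = daily_amount_mg(70.0, 0.5)
mass = preparation_mass_mg(amount, 0.20)
print(f"{amount:.1f} mg active per day in {mass:.1f} mg of preparation")
```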

G. Kits

The present invention provides kits for treating dengue fever and other flaviviral infections. For example, a kit may comprise compositions comprising compounds identified herein as modulators of dengue virus E protein or other class II E protein. The compositions may be pharmaceutical compositions comprising a pharmaceutically acceptable excipient. In other embodiments involving kits, this invention contemplates a kit including compositions of the present invention, and optionally instructions for their use. Kit components may be packaged for either manual or partially or wholly automated practice of the foregoing methods. Such kits may have a variety of uses, including, for example, imaging, diagnosis, therapy, and other applications.

H. Further Characterization of Dengue Virus E Protein or Other Flavivirus E Protein Druggable Regions and Complexes of the Same

H.1 Analysis of Proteins by X-ray Crystallography

(i) X-ray Structure Determination

Exemplary methods for obtaining the three dimensional structure of the crystalline form of a molecule or complex are described herein and, in view of this specification, variations on these methods will be apparent to those skilled in the art (see Ducruix and Giegé 1992, IRL Press, Oxford, England).

A variety of methods involving x-ray crystallography are contemplated by the present invention. For example, the present invention contemplates producing a dengue virus E protein or other class II E protein, or a fragment or a complex, such as a trimer, thereof, by: (a) introducing into a host cell an expression vector comprising a nucleic acid encoding for a dengue virus E protein or other class II E protein, or a fragment thereof; (b) culturing the host cell in a cell culture medium to express the protein or fragment; (c) isolating the protein or fragment from the cell culture; and (d) crystallizing the protein or fragment thereof. Optionally, said E protein may be complexed with a molecule or another E protein prior to crystallization. Alternatively, the present invention contemplates determining the three dimensional structure of a crystallized dengue virus E protein or other class II E protein, or a fragment thereof, by: (a) crystallizing a dengue virus E protein or other class II E protein, or a fragment thereof, such that the crystals will diffract x-rays to a resolution of 3.5 Å or better; and (b) analyzing the polypeptide or fragment by x-ray diffraction to determine the three-dimensional structure of the crystallized polypeptide.

X-ray crystallography techniques generally require that the protein molecules be available in the form of a crystal. Crystals may be grown from a solution containing a purified dengue virus E protein or other class II E protein, or a fragment thereof (e.g., a stable domain), by a variety of conventional processes. These processes include, for example, batch, liquid bridge, dialysis, and vapour diffusion (e.g., hanging drop or sitting drop) methods. (See, for example, McPherson, 1982, John Wiley, New York; McPherson, 1990, Eur. J. Biochem. 189: 1-23; Weber, 1991, Adv. Protein Chem. 41:1-36).

In certain embodiments, native crystals of the invention may be grown by adding precipitants to the concentrated solution of the polypeptide. The precipitants are added at a concentration just below that necessary to precipitate the protein. Water may be removed by controlled evaporation to produce precipitating conditions, which are maintained until crystal growth ceases.

The formation of crystals is dependent on a number of different parameters, including pH, temperature, protein concentration, the nature of the solvent and precipitant, as well as the presence of added ions or ligands to the protein. In addition, the sequence of the polypeptide being crystallized will have a significant effect on the success of obtaining crystals. Many routine crystallization experiments may be needed to screen all these parameters for the few combinations that might give crystals suitable for x-ray diffraction analysis (See, for example, Jancarik, J & Kim, S. H., J. Appl. Cryst. 1991 24: 409-411).

Crystallization robots may automate and speed up the work of reproducibly setting up large numbers of crystallization experiments. Once some suitable set of conditions for growing the crystal is found, variations of the conditions may be systematically screened in order to find the set of conditions which allows the growth of sufficiently large, single, well ordered crystals. In certain instances, a dengue virus E protein or other class II E protein is co-crystallized with a compound that stabilizes the polypeptide.

A number of methods are available to produce suitable radiation for x-ray diffraction. For example, x-ray beams may be produced by synchrotron rings where electrons (or positrons) are accelerated through an electromagnetic field while traveling at close to the speed of light. Because the emitted wavelength may also be controlled, synchrotrons may be used as a tunable x-ray source (Hendrickson W A., Trends Biochem Sci 2000 December; 25(12):637-43). For less conventional Laue diffraction studies, polychromatic x-rays covering a broad wavelength window are used to observe many diffraction intensities simultaneously (Stoddard, B. L., Curr. Opin. Struct Biol 1998 October; 8(5):612-8). Neutrons may also be used for solving protein crystal structures (Gutberlet T, Heinemann U & Steiner M., Acta Crystallogr D 2001; 57: 349-54).

Before data collection commences, a protein crystal may be frozen to protect it from radiation damage. A number of different cryo-protectants may be used to assist in freezing the crystal, such as methyl pentanediol (MPD), isopropanol, ethylene glycol, glycerol, formate, citrate, mineral oil, or a low-molecular-weight polyethylene glycol (PEG). The present invention contemplates a composition comprising a dengue virus E protein or other class II E protein and a cryo-protectant. As an alternative to freezing the crystal, the crystal may also be used for diffraction experiments performed at temperatures above the freezing point of the solution. In these instances, the crystal may be protected from drying out by placing it in a narrow capillary of a suitable material (generally glass or quartz) with some of the crystal growth solution included in order to maintain vapour pressure.

X-ray diffraction results may be recorded in a number of ways known to one of skill in the art. Examples of area electronic detectors include charge coupled device detectors, multi-wire area detectors and phosphoimager detectors (Amemiya, Y, 1997. Methods in Enzymology, Vol. 276. Academic Press, San Diego, pp. 233-243; Westbrook, E. M., Naday, I. 1997. Methods in Enzymology, Vol. 276. Academic Press, San Diego, pp. 244-268; Kahn, R. & Fourme, R. 1997. Methods in Enzymology, Vol. 276. Academic Press, San Diego, pp. 268-286).

A suitable system for laboratory data collection might include a Bruker AXS Proteum R system, equipped with a copper rotating anode source, Confocal Max-Flux™ optics and a SMART 6000 charge coupled device detector. Collection of x-ray diffraction patterns is well documented by those skilled in the art (See, for example, Ducruix and Giegé, 1992, IRL Press, Oxford, England).

The theory behind diffraction by a crystal upon exposure to x-rays is well known. Because phase information is not directly measured in the diffraction experiment, and is needed to reconstruct the electron density map, methods that can recover this missing information are required. One class of methods for solving structures ab initio is the real/reciprocal space cycling techniques. Suitable real/reciprocal space cycling search programs include shake-and-bake (Weeks C M, DeTitta G T, Hauptman H A, Thuman P, Miller R Acta Crystallogr A 1994; V50: 210-20).

Other methods for deriving phases may also be needed. These techniques generally rely on the idea that if two or more measurements of the same reflection are made where strong, measurable, differences are attributable to the characteristics of a small subset of the atoms alone, then the contributions of other atoms can be, to a first approximation, ignored, and positions of these atoms may be determined from the difference in scattering by one of the above techniques. Knowing the position and scattering characteristics of those atoms, one may calculate what phase the overall scattering must have had to produce the observed differences.

One version of this technique is the isomorphous replacement technique, which requires the introduction of new, well ordered, x-ray scatterers into the crystal. These additions are usually heavy metal atoms (so that they make a significant difference in the diffraction pattern); and if the additions do not change the structure of the molecule or of the crystal cell, the resulting crystals should be isomorphous. Isomorphous replacement experiments are usually performed by diffusing different heavy metals into the channels of a pre-existing protein crystal. Growing the crystal from protein that has been soaked in the heavy atom is also possible (Petsko, G. A., 1985. Methods in Enzymology, Vol. 114. Academic Press, Orlando, pp. 147-156). Alternatively, the heavy atom may also be reactive and attached covalently to exposed amino acid side chains (such as the sulfur atom of cysteine) or it may be associated through non-covalent interactions. It is sometimes possible to replace endogenous light metals in metallo-proteins with heavier ones, e.g., zinc by mercury, or calcium by samarium (Petsko, G. A., 1985. Methods in Enzymology, Vol. 114. Academic Press, Orlando, pp. 147-156). Exemplary sources for such heavy compounds include, without limitation, sodium bromide, sodium selenate, trimethyl lead acetate, mercuric chloride, methyl mercury acetate, platinum tetracyanide, platinum tetrachloride, nickel chloride, and europium chloride.

A second technique for generating differences in scattering involves the phenomenon of anomalous scattering. X-rays that cause the displacement of an electron in an inner shell to a higher shell are subsequently rescattered, but there is a time lag that shows up as a phase delay. This phase delay is observed as a (generally quite small) difference in intensity between reflections known as Friedel mates that would be identical if no anomalous scattering were present. A second effect related to this phenomenon is that differences in the intensity of scattering of a given atom will vary in a wavelength dependent manner, giving rise to what are known as dispersive differences. In principle anomalous scattering occurs with all atoms, but the effect is strongest in heavy atoms, and may be maximized by using x-rays at a wavelength where the energy is equal to the difference in energy between shells. The technique therefore requires the incorporation of some heavy atom much as is needed for isomorphous replacement, although for anomalous scattering a wider variety of atoms are suitable, including lighter metal atoms (copper, zinc, iron) in metallo-proteins. One method for preparing a protein for anomalous scattering involves replacing the methionine residues in whole or in part with selenium containing seleno-methionine. Soaks with halide salts such as bromides and other non-reactive ions may also be effective (Dauter Z, Li M, Wlodawer A., Acta Crystallogr D 2001; 57: 239-49).
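The intensity difference between Friedel mates can be illustrated with a minimal one-dimensional sketch. It assumes a hypothetical four-atom structure, and the scattering factors, including the anomalous corrections f' and f'' of the heavy atom, are invented for illustration and are not tabulated values.

```python
import cmath
import math


def structure_factor(h, atoms):
    """F(h) = sum of f_j * exp(2*pi*i*h*x_j) for a one-dimensional toy structure.

    atoms is a list of (fractional coordinate x_j, scattering factor f_j).
    """
    return sum(f * cmath.exp(2j * math.pi * h * x) for x, f in atoms)


def friedel_difference(h, atoms):
    """|F(h)| - |F(-h)|; zero when every scattering factor is purely real."""
    return abs(structure_factor(h, atoms)) - abs(structure_factor(-h, atoms))


# Hypothetical light atoms with real scattering factors.
light_atoms = [(0.10, 6.0), (0.37, 7.0), (0.62, 8.0)]

# The same heavy atom without and with an anomalous contribution.
# f0 = 34, f' = -4 and f'' = 3.5 are made-up illustrative numbers.
heavy_normal = (0.25, 34.0 + 0.0j)
heavy_anomalous = (0.25, (34.0 - 4.0) + 3.5j)

for h in range(1, 4):
    d_normal = friedel_difference(h, light_atoms + [heavy_normal])
    d_anom = friedel_difference(h, light_atoms + [heavy_anomalous])
    print(f"h={h}: no anomalous scatterer {d_normal:+.3e}, with anomalous scatterer {d_anom:+.3e}")
```

In this toy model the difference vanishes, to rounding error, when all scattering factors are real, and becomes non-zero once one atom carries an imaginary component, which is the signal exploited in anomalous phasing.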

In another process, known as multiwavelength anomalous diffraction or MAD, two to four suitable wavelengths of data are collected (Hendrickson, W. A. and Ogata, C. M. 1997 Methods in Enzymology 276, 494-523). Phasing by various combinations of single and multiple isomorphous replacement and anomalous scattering is also possible. For example, SIRAS (single isomorphous replacement with anomalous scattering) utilizes both the isomorphous and anomalous differences for one derivative to derive phases. More traditionally, several different heavy atoms are soaked into different crystals to get sufficient phase information from isomorphous differences while ignoring anomalous scattering, in the technique known as multiple isomorphous replacement (MIR) (Petsko, G. A., 1985. Methods in Enzymology, Vol. 114. Academic Press, Orlando, pp. 147-156).

Additional restraints on the phases may be derived from density modification techniques. These techniques use either generally known features of electron density distribution or known facts about that particular crystal to improve the phases. For example, because protein regions of the crystal scatter more strongly than solvent regions, solvent flattening/flipping may be used to adjust phases to make solvent density a uniform flat value (Zhang, K. Y. J., Cowtan, K. and Main, P. Methods in Enzymology 277, 1997 Academic Press, Orlando pp 53-64). If more than one molecule of the protein is present in the asymmetric unit, the fact that the different molecules should be virtually identical may be exploited to further reduce phase error using non-crystallographic symmetry averaging (Vellieux, F. M. D. and Read, R. J. Methods in Enzymology 277, 1997 Academic Press, Orlando pp 18-52). Suitable programs for performing these processes include DM and other programs of the CCP4 suite (Collaborative Computational Project, Number 4. 1994. Acta Cryst. D50, 760-763) and CNX.

The unit cell dimensions, symmetry, vector amplitude and derived phase information can be used in a Fourier transform function to calculate the electron density in the unit cell, i.e., to generate an experimental electron density map. This may be accomplished using programs of the CNX or CCP4 packages. The resolution is measured in Ångstrom (Å) units, and is closely related to how far apart two objects need to be before they can be reliably distinguished. The smaller this number is, the higher the resolution and therefore the greater the amount of detail that can be seen. Preferably, crystals of the invention diffract x-rays to a resolution of about 4.0, 3.5, 3.0, 2.5, 2.0, 1.5, 1.0, or 0.5 Å or better.
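The role of amplitudes and phases in the Fourier synthesis can be illustrated with a minimal one-dimensional sketch. It uses a hypothetical 16-point "unit cell" and a plain discrete Fourier transform; it is not the procedure implemented in CNX or CCP4, and all names and values are assumptions made for illustration.

```python
import cmath
import math


def forward_transform(density):
    """Structure factors F(h) of a sampled one-dimensional density."""
    n = len(density)
    return [
        sum(rho * cmath.exp(2j * math.pi * h * x / n) for x, rho in enumerate(density))
        for h in range(n)
    ]


def fourier_synthesis(amplitudes, phases):
    """Density rho(x) rebuilt from amplitudes |F(h)| and phases (radians)."""
    n = len(amplitudes)
    return [
        sum(
            a * cmath.exp(1j * p) * cmath.exp(-2j * math.pi * h * x / n)
            for h, (a, p) in enumerate(zip(amplitudes, phases))
        ).real / n
        for x in range(n)
    ]


# Hypothetical density with two "atoms" of different weight.
true_density = [0.0, 0.0, 1.0, 5.0, 1.0, 0.0, 0.0, 0.0,
                0.0, 2.0, 8.0, 2.0, 0.0, 0.0, 0.0, 0.0]

factors = forward_transform(true_density)
amplitudes = [abs(f) for f in factors]          # measurable in the experiment
phases = [cmath.phase(f) for f in factors]      # lost in the experiment

with_phases = fourier_synthesis(amplitudes, phases)
without_phases = fourier_synthesis(amplitudes, [0.0] * len(amplitudes))

error_with = max(abs(a - b) for a, b in zip(with_phases, true_density))
error_without = max(abs(a - b) for a, b in zip(without_phases, true_density))
print(f"max error with phases: {error_with:.2e}; with phases set to zero: {error_without:.2f}")
```

With the correct phases the synthesis reproduces the starting density to rounding error, whereas the same amplitudes combined with zeroed phases give a different map, which is why phase recovery is needed before a map can be calculated.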

As used herein, the term “modeling” includes the quantitative and qualitative analysis of molecular structure and/or function based on atomic structural information and interaction models. The term “modeling” includes conventional numeric-based molecular dynamic and energy minimization models, interactive computer graphic models, modified molecular mechanics models, distance geometry and other structure-based constraint models.

Model building may be accomplished by either the crystallographer using a computer graphics program such as TURBO or O (Jones, T. A. et al., Acta Crystallogr. A47, 100-119, 1991) or, under suitable circumstances, by using a fully automated model building program, such as wARP (Anastassis Perrakis, Richard Morris & Victor S. Lamzin; Nature Structural Biology, May 1999 Volume 6 Number 5 pp 458-463) or MAID (Levitt, D. G., Acta Crystallogr. D 2001 V57: 1013-9). This structure may be used to calculate model-derived diffraction amplitudes and phases. The model-derived and experimental diffraction amplitudes may be compared and the agreement between them can be described by a parameter referred to as R-factor. A high degree of correlation in the amplitudes corresponds to a low R-factor value, with 0.0 representing exact agreement and 0.59 representing a completely random structure. Because the R-factor may be lowered by introducing more free parameters into the model, an unbiased, cross-validated version of the R-factor known as the R-free gives a more objective measure of model quality. For the calculation of this parameter a subset of reflections (generally around 10%) is set aside at the beginning of the refinement and not used as part of the refinement target. These reflections are then compared to those predicted by the model (Kleywegt G J, Brunger A T, Structure 1996 Aug. 15; 4(8):897-904).
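The R-factor and the partitioning of reflections for R-free can be sketched as follows. The sketch uses the conventional definition R = Σ| |Fobs| − |Fcalc| | / Σ|Fobs| and assumes, as a simplification, that the observed and calculated amplitudes are already on a common scale; the function names, the fixed random seed and the amplitude values are hypothetical.

```python
import random


def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|), amplitudes assumed on one scale."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("amplitude lists must be non-empty and of equal length")
    return sum(abs(o - c) for o, c in zip(f_obs, f_calc)) / sum(f_obs)


def split_work_free(n_reflections, free_fraction=0.10, seed=0):
    """Randomly set aside a test set (about 10% by default) before refinement."""
    indices = list(range(n_reflections))
    random.Random(seed).shuffle(indices)
    n_free = max(1, round(n_reflections * free_fraction))
    return sorted(indices[n_free:]), sorted(indices[:n_free])


def r_work_and_r_free(f_obs, f_calc, work, free):
    """R-factor over the working set and over the set-aside reflections."""
    r_work = r_factor([f_obs[i] for i in work], [f_calc[i] for i in work])
    r_free = r_factor([f_obs[i] for i in free], [f_calc[i] for i in free])
    return r_work, r_free


# Hypothetical amplitudes: the "model" reproduces the data to within a few percent.
rng = random.Random(1)
observed = [rng.uniform(10.0, 100.0) for _ in range(200)]
calculated = [f * rng.uniform(0.9, 1.1) for f in observed]

work_set, free_set = split_work_free(len(observed))
print(r_work_and_r_free(observed, calculated, work_set, free_set))
```

In an actual refinement only the working reflections enter the refinement target, so the R-free calculated from the set-aside reflections is not lowered simply by adding parameters to the model.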

The model may be improved using computer programs that maximize the probability that the observed data was produced from the predicted model, while simultaneously optimizing the model geometry. For example, the CNX program may be used for model refinement, as can the XPLOR program (1992, Nature 355:472-475, G. N. Murshudov, A. A. Vagin and E. J. Dodson, (1997) Acta Cryst. D 53, 240-255). In order to maximize the convergence radius of refinement, simulated annealing refinement using torsion angle dynamics may be employed in order to reduce the degrees of freedom of motion of the model (Adams P D, Pannu N S, Read R J, Brunger A T., Proc Natl Acad Sci U S A 1997 May 13; 94(10):5018-23). Where experimental phase information is available (e.g. where MAD data was collected) Hendrickson-Lattman phase probability targets may be employed. Isotropic or anisotropic domain, group or individual temperature factor refinement may be used to model variance of the atomic position from its mean. Well defined peaks of electron density not attributable to protein atoms are generally modeled as water molecules. Water molecules may be found by manual inspection of electron density maps, or with automatic water picking routines. Additional small molecules, including ions, cofactors, buffer molecules or substrates may be included in the model if sufficiently unambiguous electron density is observed in a map.

In general, the R-free is rarely as low as 0.15 and may be as high as 0.35 or greater for a reasonably well-determined protein structure. The residual difference is a consequence of approximations in the model (inadequate modeling of residual structure in the solvent, modeling atoms as isotropic Gaussian spheres, assuming all molecules are identical rather than having a set of discrete conformers, etc.) and errors in the data (Lattman E E., Proteins 1996; 25: i-ii). In refined structures at high resolution, there are usually no major errors in the orientation of individual residues, and the estimated errors in atomic positions are usually around 0.1-0.2 Å, and up to 0.3 Å.

The three dimensional structure of a new crystal may be modeled using molecular replacement. The term “molecular replacement” refers to a method that involves generating a preliminary model of a molecule or complex whose structure coordinates are unknown, by orienting and positioning a molecule whose structure coordinates are known within the unit cell of the unknown crystal, so as best to account for the observed diffraction pattern of the unknown crystal. Phases may then be calculated from this model and combined with the observed amplitudes to give an approximate Fourier synthesis of the structure whose coordinates are unknown. This, in turn, can be subject to any of the several forms of refinement to provide a final, accurate structure of the unknown crystal. See Lattman, E., “Use of the Rotation and Translation Functions”, in Methods in Enzymology, 115, pp. 55-77 (1985); M. G. Rossmann, ed., “The Molecular Replacement Method”, Int. Sci. Rev. Ser., No. 13, Gordon & Breach, New York, (1972).

Commonly used computer software packages for molecular replacement are CNX, X-PLOR (Brunger 1992, Nature 355: 472-475), AMoRE (Navaza, 1994, Acta Crystallogr. A50:157-163), the CCP4 package, the MERLOT package (P. M. D. Fitzgerald, J. Appl. Cryst., Vol. 21, pp. 273-278, 1988) and XTALVIEW (McRee et al (1992) J. Mol. Graphics 10: 44-46). The quality of the model may be analyzed using a program such as PROCHECK or 3D-Profiler (Laskowski et al 1993 J. Appl. Cryst. 26:283-291; Luthy R. et al, Nature 356: 83-85, 1992; and Bowie, J. U. et al, Science 253: 164-170, 1991).

Homology modeling (also known as comparative modeling or knowledge-based modeling) methods may also be used to develop a three dimensional model from a polypeptide sequence based on the structures of known proteins. The method utilizes a computer model of a known protein, a computer representation of the amino acid sequence of the polypeptide with an unknown structure, and standard computer representations of the structures of amino acids. This method is well known to those skilled in the art (Greer, 1985, Science 228, 1055; Blundell et al 1988, Eur. J. Biochem. 172, 513; Knighton et al., 1992, Science 258:130-135, http://biochem.vt.edu/courses/-modeling/homology.htn). Computer programs that can be used in homology modeling include QUANTA and the Homology module in the Insight II modeling package distributed by Molecular Simulations Inc, or MODELLER (Rockefeller University, www.iucr.ac.uk/sinris-top/logical/prg-modeller.html).

Once a homology model has been generated it is analyzed to determine its correctness. A computer program available to assist in this analysis is the Protein Health module in QUANTA which provides a variety of tests. Other programs that provide structure analysis along with output include PROCHECK and 3D-Profiler (Luthy R. et al, Nature 356: 83-85, 1992; and Bowie, J. U. et al, Science 253: 164-170, 1991). Once any irregularities have been resolved, the entire structure may be further refined.

Other molecular modeling techniques may also be employed in accordance with this invention. See, e.g., Cohen, N. C. et al, J. Med. Chem., 33, pp. 883-894 (1990). See also, Navia, M. A. and M. A. Murcko, Current Opinion in Structural Biology, 2, pp. 202-210 (1992).

Under suitable circumstances, the entire process of solving a crystal structure may be accomplished in an automated fashion by a system such as ELVES (http://ucxray.berkeley.edu/~jamesh/elves/index.html) with little or no user intervention.

(ii) X-ray Structure

The present invention provides methods for determining some or all of the structural coordinates for amino acids of a dengue virus E protein or other class II E protein, or a complex thereof.

In another aspect, the present invention provides methods for identifying a druggable region of a dengue virus E protein or other class II E protein. For example, one such method includes: (a) obtaining crystals of a dengue virus E protein or other class II E protein or a complex or a fragment thereof such that the three dimensional structure of the crystallized protein can be determined to a resolution of 3.5 Å or better; (b) determining the three dimensional structure of the crystallized polypeptide or fragment using x-ray diffraction; and (c) identifying a druggable region of a dengue virus E protein or other class II E protein based on the three-dimensional structure of the polypeptide or fragment.

A three dimensional structure of a molecule or complex may be described by the set of atoms that best predicts the observed diffraction data (that is, which possesses a minimal R value). Files may be created for the structure that define each atom by its chemical identity, spatial coordinates in three dimensions, root mean squared deviation from the mean observed position and fractional occupancy of the observed position.

Those of skill in the art understand that a set of structure coordinates for a protein, complex or a portion thereof, is a relative set of points that define a shape in three dimensions. Thus, it is possible that an entirely different set of coordinates could define a similar or identical shape. Moreover, slight variations in the individual coordinates may have little effect on overall shape. Such variations in coordinates may be generated because of mathematical manipulations of the structure coordinates. For example, structure coordinates could be manipulated by crystallographic permutations of the structure coordinates, fractionalization of the structure coordinates, integer additions or subtractions to sets of the structure coordinates, inversion of the structure coordinates or any combination of the above. Alternatively, modifications in the crystal structure due to mutations, additions, substitutions, and/or deletions of amino acids, or other changes in any of the components that make up the crystal, could also yield variations in structure coordinates. Such slight variations in the individual coordinates will have little effect on overall shape. If such variations are within an acceptable standard error as compared to the original coordinates, the resulting three-dimensional shape is considered to be structurally equivalent. It should be noted that slight variations in individual structure coordinates of a dengue virus E protein or other class II E protein or a complex thereof would not be expected to significantly alter the nature of modulators that could associate with a druggable region thereof. Thus, for example, a modulator that bound to the active site of a dengue virus E protein or other class II E protein would also be expected to bind to or interfere with another active site whose structure coordinates define a shape that falls within the acceptable error.
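The point that different coordinate sets can define the same shape can be illustrated with a minimal sketch that compares inter-point distances before and after a translation and an inversion. The four points and the helper names are hypothetical, and comparing distance lists is only one simple way to test for shape identity.

```python
import math
from itertools import combinations


def pairwise_distances(coords):
    """All inter-point distances, in a fixed pair order."""
    return [math.dist(a, b) for a, b in combinations(coords, 2)]


def translate(coords, shift):
    """Add the same offset to every point (e.g. an integer addition)."""
    return [tuple(c + s for c, s in zip(point, shift)) for point in coords]


def invert(coords):
    """Invert every point through the origin."""
    return [tuple(-c for c in point) for point in coords]


def same_shape(coords_a, coords_b, tolerance=1e-9):
    """True when corresponding inter-point distances agree within tolerance.

    A distance comparison cannot tell a structure from its mirror image,
    so an inverted copy is reported as the same shape.
    """
    return all(
        abs(x - y) <= tolerance
        for x, y in zip(pairwise_distances(coords_a), pairwise_distances(coords_b))
    )


# Hypothetical coordinates (in Å) of four points.
points = [(1.5, 2.25, 0.0), (3.0, 2.5, 1.0), (4.25, 0.5, 1.75), (2.0, -1.0, 3.5)]

print(same_shape(points, translate(points, (10, -5, 3))))   # translated copy
print(same_shape(points, invert(points)))                   # inverted copy
```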

A crystal structure of the present invention may be used to make a structural or computer model of the polypeptide, complex or portion thereof. A model may represent the secondary, tertiary and/or quaternary structure of the polypeptide, complex or portion. The configurations of points in space derived from structure coordinates according to the invention can be visualized as, for example, a holographic image, a stereodiagram, a model or a computer-displayed image, and the invention thus includes such images, diagrams or models.

(iii) Structural Equivalents

Various computational analyses can be used to determine whether a molecule or the active site portion thereof is structurally equivalent, with respect to its three-dimensional structure, to all or part of a structure of a dengue virus E protein or other class II E protein or a portion thereof.

For the purpose of this invention, any molecule or complex or portion thereof, that has a root mean square deviation of conserved residue backbone atoms (N, Cα, C, O) of less than about 1.75 Å, when superimposed on the relevant backbone atoms described by the reference structure coordinates of a dengue virus E protein or other class II E protein, is considered “structurally equivalent” to the reference molecule. That is to say, the crystal structures of those portions of the two molecules are substantially identical, within acceptable error. Alternatively, the root mean square deviation may be less than about 1.50, 1.40, 1.25, 1.0, 0.75, 0.5 or 0.35 Å.

The term “root mean square deviation” is understood in the art and means the square root of the arithmetic mean of the squares of the deviations. It is a way to express the deviation or variation from a trend or object.
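The definition can be written out as a minimal sketch. It assumes that the two coordinate sets list corresponding atoms in the same order and have already been superimposed; the superposition step itself is not shown, and the function names and coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Square root of the arithmetic mean of the squared atom-to-atom deviations."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


def structurally_equivalent(coords_a, coords_b, cutoff=1.75):
    """Apply an RMSD cutoff in Å (1.75 Å by default; tighter cutoffs may be passed)."""
    return rmsd(coords_a, coords_b) < cutoff


# Hypothetical, already superimposed backbone atom positions (in Å).
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
candidate = [(x + 0.3, y, z) for x, y, z in reference]

print(f"RMSD = {rmsd(reference, candidate):.2f} Å")
print(structurally_equivalent(reference, candidate))
```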

In another aspect, the present invention provides a scalable three-dimensional configuration of points, at least a portion of said points, and preferably all of said points, derived from structural coordinates of at least a portion of a dengue virus E protein or other class II E protein and having a root mean square deviation from the structure coordinates of the dengue virus E protein or other class II E protein of less than 1.50, 1.40, 1.25, 1.0, 0.75, 0.5 or 0.35 Å. In certain embodiments, the portion of a dengue virus E protein or other class II E protein is 25%, 33%, 50%, 66%, 75%, 85%, 90% or 95% or more of the amino acid residues contained in the polypeptide.

In another aspect, the present invention provides a molecule or complex including a druggable region of a dengue virus E protein or other class II E protein, the druggable region being defined by a set of points having a root mean square deviation of less than about 1.75 Å from the structural coordinates for points representing (a) the backbone atoms of the amino acids contained in a druggable region of a dengue virus E protein or other class II E protein, (b) the side chain atoms (and optionally the Cα atoms) of the amino acids contained in such druggable region, or (c) all the atoms of the amino acids contained in such druggable region. In certain embodiments, only a portion of the amino acids of a druggable region may be included in the set of points, such as 25%, 33%, 50%, 66%, 75%, 85%, 90% or 95% or more of the amino acid residues contained in the druggable region. In certain embodiments, the root mean square deviation may be less than 1.50, 1.40, 1.25, 1.0, 0.75, 0.5, or 0.35 Å. In still other embodiments, a stable domain, fragment or structural motif is used in place of a druggable region.

(iv) Machine Displays and Machine Readable Storage Media

The invention provides a machine-readable storage medium including a data storage material encoded with machine readable data which, when using a machine programmed with instructions for using said data, displays a graphical three-dimensional representation of any of the molecules or complexes, or portions thereof, of this invention. In another embodiment, the graphical three-dimensional representation of such molecule, complex or portion thereof includes the root mean square deviation of certain atoms of such molecule by a specified amount, such as the backbone atoms by less than 0.8 Å. In another embodiment, a structural equivalent of such molecule, complex, or portion thereof, may be displayed. In another embodiment, the portion may include a druggable region of the dengue virus E protein or other class II E protein.

According to one embodiment, the invention provides a computer for determining at least a portion of the structure coordinates corresponding to x-ray diffraction data obtained from a molecule or complex, wherein said computer includes: (a) a machine-readable data storage medium comprising a data storage material encoded with machine-readable data, wherein said data comprises at least a portion of the structural coordinates of a dengue virus E protein or other class II E protein; (b) a machine-readable data storage medium comprising a data storage material encoded with machine-readable data, wherein said data comprises x-ray diffraction data from said molecule or complex; (c) a working memory for storing instructions for processing said machine-readable data of (a) and (b); (d) a central-processing unit coupled to said working memory and to said machine-readable data storage medium of (a) and (b) for performing a Fourier transform of the machine readable data of (a) and for processing said machine readable data of (b) into structure coordinates; and (e) a display coupled to said central-processing unit for displaying said structure coordinates of said molecule or complex. In certain embodiments, the structural coordinates displayed are structurally equivalent to the structural coordinates of a dengue virus E protein or other class II E protein.

In an alternative embodiment, the machine-readable data storage medium includes a data storage material encoded with a first set of machine readable data which includes the Fourier transform of the structure coordinates of a dengue virus E protein or other class II E protein or a portion thereof, and which, when using a machine programmed with instructions for using said data, can be combined with a second set of machine readable data including the x-ray diffraction pattern of a molecule or complex to determine at least a portion of the structure coordinates corresponding to the second set of machine readable data.
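By way of non-limiting illustration, the Fourier transform of a set of structure coordinates is conventionally expressed as calculated structure factors, F(hkl) = Σ f_j exp[2πi(h x_j + k y_j + l z_j)], where (x_j, y_j, z_j) are fractional coordinates. The sketch below makes simplifying assumptions that are not taken from this specification: each scattering factor f_j is a constant (the atomic number), and atomic displacement (B) factors are ignored. The function name and the two-atom example are hypothetical.

```python
import cmath

def structure_factor(atoms, hkl):
    """Calculated structure factor F(hkl).

    atoms: list of (f, (x, y, z)) with f a constant scattering factor and
           (x, y, z) fractional coordinates in the unit cell.
    """
    h, k, l = hkl
    total = 0j
    for f, (x, y, z) in atoms:
        total += f * cmath.exp(2j * cmath.pi * (h * x + k * y + l * z))
    return total

# Hypothetical cell with two carbon atoms (f = 6) related by a body-centering translation.
atoms = [(6.0, (0.0, 0.0, 0.0)), (6.0, (0.5, 0.5, 0.5))]

print(abs(structure_factor(atoms, (1, 0, 0))))  # close to 0: h + k + l is odd
print(abs(structure_factor(atoms, (1, 1, 0))))  # close to 12: h + k + l is even
```

The amplitudes |F(hkl)| computed in this way from a known model are the quantities that would be compared or combined with the measured diffraction pattern of the unknown molecule or complex.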

For example, a system for reading a data storage medium may include a computer including a central processing unit (“CPU”), a working memory which may be, e.g., RAM (random access memory) or “core” memory, mass storage memory (such as one or more disk drives or CD-ROM drives), one or more display devices (e.g., cathode-ray tube (“CRT”) displays, light emitting diode (“LED”) displays, liquid crystal displays (“LCDs”), electroluminescent displays, vacuum fluorescent displays, field emission displays (“FEDs”), plasma displays, projection panels, etc.), one or more user input devices (e.g., keyboards, microphones, mice, touch screens, etc.), one or more input lines, and one or more output lines, all of which are interconnected by a conventional bidirectional system bus. The system may be a stand-alone computer, or may be networked (e.g., through local area networks, wide area networks, intranets, extranets, or the internet) to other systems (e.g., computers, hosts, servers, etc.). The system may also include additional computer controlled devices such as consumer electronics and appliances.

Input hardware may be coupled to the computer by input lines and may be implemented in a variety of ways. Machine-readable data of this invention may be inputted via the use of a modem or modems connected by a telephone line or dedicated data line. Alternatively or additionally, the input hardware may include CD-ROM drives or disk drives. In conjunction with a display terminal, a keyboard may also be used as an input device.

Output hardware may be coupled to the computer by output lines and may similarly be implemented by conventional devices. By way of example, the output hardware may include a display device for displaying a graphical representation of an active site of this invention using a program such as QUANTA as described herein. Output hardware might also include a printer, so that hard copy output may be produced, or a disk drive, to store system output for later use.

In operation, a CPU coordinates the use of the various input and output devices, coordinates data accesses from mass storage devices, accesses to and from working memory, and determines the sequence of data processing steps. A number of programs may be used to process the machine-readable data of this invention. Such programs are discussed in reference to the computational methods of drug discovery as described herein. References to components of the hardware system are included as appropriate throughout the following description of the data storage medium.

Machine-readable storage devices useful in the present invention include, but are not limited to, magnetic devices, electrical devices, optical devices, and combinations thereof. Examples of such data storage devices include, but are not limited to, hard disk devices, CD devices, digital video disk devices, floppy disk devices, removable hard disk devices, magneto-optic disk devices, magnetic tape devices, flash memory devices, bubble memory devices, holographic storage devices, and any other mass storage peripheral device. It should be understood that these storage devices include necessary hardware (e.g., drives, controllers, power supplies, etc.) as well as any necessary media (e.g., disks, flash cards, etc.) to enable the storage of data.

In one embodiment, the present invention contemplates a computer readable storage medium comprising structural data, wherein the data include the identity and three-dimensional coordinates of a dengue virus E protein or other class II E protein or portion thereof. In another aspect, the present invention contemplates a database comprising the identity and three-dimensional coordinates of a dengue virus E protein or other class II E protein or a portion thereof. Alternatively, the present invention contemplates a database comprising a portion or all of the atomic coordinates of a dengue virus E protein or other class II E protein or portion thereof.
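By way of non-limiting illustration, atomic identities and coordinates are commonly stored as fixed-column ATOM records in the Protein Data Bank (PDB) file format. This specification does not prescribe that format; the sketch below assumes it, writes two records and reads them back. The function names and coordinate values are hypothetical, and the atom-name placement is simplified to names of at most three characters.

```python
def format_atom(serial, name, res_name, chain, res_seq, x, y, z):
    """Encode one atom as a PDB-style ATOM record (columns 1-54 only)."""
    return (f"ATOM  {serial:>5} {' ' + name:<4} {res_name:>3} {chain}{res_seq:>4}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")

def parse_atom(line):
    """Decode the identity and coordinates from a PDB-style ATOM record."""
    return {
        "serial": int(line[6:11]),        # columns 7-11
        "name": line[12:16].strip(),      # columns 13-16
        "res_name": line[17:20].strip(),  # columns 18-20
        "chain": line[21],                # column 22
        "res_seq": int(line[22:26]),      # columns 23-26
        "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
    }

def load_coordinates(lines):
    """Keep only ATOM records and parse each into a dictionary."""
    return [parse_atom(line) for line in lines if line.startswith("ATOM")]

# Hypothetical coordinate values, in Å.
records = [
    format_atom(1, "N", "MET", "A", 1, 27.340, 24.430, 2.614),
    format_atom(2, "CA", "MET", "A", 1, 26.266, 25.413, 2.842),
    "END",
]
for atom in load_coordinates(records):
    print(atom["res_name"], atom["res_seq"], atom["name"], atom["xyz"])
```

A database of the kind contemplated above could hold such records directly or store the parsed fields in separate columns.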

(v) Structurally Similar Molecules and Complexes

Structural coordinates for a dengue virus E protein or other class II E protein can be used to aid in obtaining structural information about another molecule or complex. This method of the invention allows determination of at least a portion of the three-dimensional structure of molecules or molecular complexes which contain one or more structural features that are similar to structural features of a dengue virus E protein or other class II E protein. Similar structural features can include, for example, regions of amino acid identity, conserved active site or binding site motifs, and similarly arranged secondary structural elements (e.g., α helices and β sheets). Many of the methods described above for determining the structure of a dengue virus E protein or other class II E protein may be used for this purpose as well.

For the present invention, a “structural homolog” is a polypeptide that contains one or more amino acid substitutions, deletions, additions, or rearrangements with respect to a subject amino acid sequence or other dengue virus E protein or other class II E protein, but that, when folded into its native conformation, exhibits or is reasonably expected to exhibit at least a portion of the tertiary (three-dimensional) structure of the polypeptide encoded by the related subject amino acid sequence or such other dengue virus E protein or other class II E protein. For example, structurally homologous molecules can contain deletions or additions of one or more contiguous or noncontiguous amino acids, such as a loop or a domain. Structurally homologous molecules also include modified polypeptide molecules that have been chemically or enzymatically derivatized at one or more constituent amino acids, including side chain modifications, backbone modifications, and N- and C-terminal modifications including acetylation, hydroxylation, methylation, amidation, and the attachment of carbohydrate or lipid moieties, cofactors, and the like.

By using molecular replacement, all or part of the structure coordinates of a dengue virus E protein or other class II E protein can be used to determine the structure of a crystallized molecule or complex whose structure is unknown more quickly and efficiently than attempting to determine such information ab initio. For example, in one embodiment this invention provides a method of utilizing molecular replacement to obtain structural information about a molecule or complex whose structure is unknown including: (a) crystallizing the molecule or complex of unknown structure; (b) generating an x-ray diffraction pattern from said crystallized molecule or complex; and (c) applying at least a portion of the structure coordinates for a dengue virus E protein or other class II E protein to the x-ray diffraction pattern to generate a three-dimensional electron density map of the molecule or complex whose structure is unknown.

In another aspect, the present invention provides a method for generating a preliminary model of a molecule or complex whose structure coordinates are unknown, by orienting and positioning the relevant portion of a dengue virus E protein or other class II E protein within the unit cell of the crystal of the unknown molecule or complex so as best to account for the observed x-ray diffraction pattern of the crystal of the molecule or complex whose structure is unknown.
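By way of non-limiting illustration, the orientation step may be sketched as a toy search in which a model is rotated, its intensities |F(hkl)|² are calculated, and each trial orientation is scored by its correlation with the observed intensities. The sketch is a deliberate simplification and is not the procedure of any particular molecular replacement program: it assumes a hypothetical cubic cell in space group P1 (in which the intensities do not depend on the position of a single molecule, so that only the orientation is searched), unit point scatterers, rotation about a single axis, and "observed" data simulated from the same model. All names and values are hypothetical.

```python
import cmath
import math

CELL = 10.0  # hypothetical cubic cell edge in Å; space group P1 is assumed

def rotate_z(coords, degrees):
    """Rotate Cartesian coordinates about the z axis."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in coords]

def intensities(coords, reflections):
    """|F(hkl)|^2 for unit point scatterers in the cubic cell."""
    result = []
    for h, k, l in reflections:
        f = sum(cmath.exp(2j * math.pi * (h * x + k * y + l * z) / CELL)
                for x, y, z in coords)
        result.append(abs(f) ** 2)
    return result

def correlation(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

REFLECTIONS = [(h, k, l) for h in range(-2, 3) for k in range(-2, 3)
               for l in range(-2, 3) if (h, k, l) != (0, 0, 0)]

# Hypothetical four-atom search model (Å) and a simulated "unknown" orientation.
search_model = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.5), (-1.5, 0.5, 1.0), (0.5, -1.0, 2.0)]
TRUE_ANGLE = 40
observed = intensities(rotate_z(search_model, TRUE_ANGLE), REFLECTIONS)

# Score every trial orientation on a 10-degree grid against the observed intensities.
scores = {angle: correlation(intensities(rotate_z(search_model, angle), REFLECTIONS),
                             observed)
          for angle in range(0, 360, 10)}
best_angle = max(scores, key=scores.get)
print(best_angle, round(scores[best_angle], 3))
```

In space groups other than P1, and in real applications generally, a full three-angle rotation search would be followed by a translation search to position the oriented model in the unit cell.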

Structural information about a portion of any crystallized molecule or complex that is sufficiently structurally similar to a portion of a dengue virus E protein or other class II E protein may be resolved by this method. In addition to a molecule that shares one or more structural features with a dengue virus E protein or other class II E protein, a molecule that has similar bioactivity, such as the same catalytic activity, substrate specificity or ligand binding activity as a dengue virus E protein or other class II E protein, may also be sufficiently structurally similar to a dengue virus E protein or other class II E protein to permit use of the structure coordinates for a dengue virus E protein or other class II E protein to solve its crystal structure.

In another aspect, the method of molecular replacement is utilized to obtain structural information about a complex containing a dengue virus E protein or other class II E protein, such as a complex between a modulator and a dengue virus E protein or other class II E protein (or a domain, fragment, ortholog, homolog etc. thereof). In certain instances, the complex includes a dengue virus E protein or other class II E protein (or a domain, fragment, ortholog, homolog etc. thereof) co-complexed with a modulator. In one embodiment of the invention, the dengue virus E protein or other class II E protein is complexed with β-OG or other detergent molecule. In yet another embodiment, the complex is a dengue virus E protein or other class II E protein trimer. In certain embodiments, the trimer may additionally comprise a modulator. For example, in one embodiment, the present invention contemplates a method for making a crystallized complex comprising a dengue virus E protein or other class II E protein, or a fragment thereof, and a compound, the method comprising: (a) crystallizing a dengue virus E protein or other class II E protein such that the crystals will diffract x-rays to a resolution of 3.5 Å or better; and (b) soaking the crystal in a solution comprising the compound, thereby producing a crystallized complex comprising the polypeptide and the compound.

Using homology modeling, a computer model of a structural homolog or other polypeptide can be built or refined without crystallizing the molecule. For example, in another aspect, the present invention provides a computer-assisted method for homology modeling a structural homolog of a dengue virus E protein or other class II E protein including: aligning the amino acid sequence of a known or suspected structural homolog with the amino acid sequence of a dengue virus E protein or other class II E protein and incorporating the sequence of the homolog into a model of a dengue virus E protein or other class II E protein derived from atomic structure coordinates to yield a preliminary model of the homolog; subjecting the preliminary model to energy minimization to yield an energy minimized model; and remodeling regions of the energy minimized model where stereochemistry restraints are violated to yield a final model of the homolog.

In another embodiment, the present invention contemplates a method for determining the crystal structure of a homolog of a polypeptide encoded by a subject amino acid sequence, or equivalent thereof, the method comprising: (a) providing the three dimensional structure of a crystallized polypeptide of a subject amino acid sequence, or a fragment thereof; (b) obtaining crystals of a homologous polypeptide comprising an amino acid sequence that is at least 80% identical to the subject amino acid sequence such that the three dimensional structure of the crystallized homologous polypeptide may be determined to a resolution of 3.5 Å or better; and (c) determining the three dimensional structure of the crystallized homologous polypeptide by x-ray crystallography based on the atomic coordinates of the three dimensional structure provided in step (a). In certain instances of the foregoing method, the atomic coordinates for the homologous polypeptide have a root mean square deviation from the backbone atoms of the polypeptide encoded by the applicable subject amino acid sequence, or a fragment thereof, of not more than 1.5 Å for all backbone atoms shared in common between the homologous polypeptide and such encoded polypeptide, or a fragment thereof.
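By way of non-limiting illustration, the "at least 80% identical" criterion may be evaluated on a pair of already-aligned sequences as sketched below. Conventions for the denominator differ between programs; this sketch assumes that identity is counted over aligned columns in which neither sequence has a gap. The function name and the two short sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gap columns are excluded from the comparison
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * matches / compared

# Hypothetical ten-residue alignment with two substitutions.
subject = "MRCIGISNRD"
homolog = "MRCVGIGNRD"
identity = percent_identity(subject, homolog)
print(identity, identity >= 80.0)  # 80.0 True
```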

(vi) NMR Analysis Using X-ray Structural Data

In another aspect, the structural coordinates of a known crystal structure may be applied to nuclear magnetic resonance data to determine the three dimensional structures of polypeptides with uncharacterized or incompletely characterized structure. (See for example, Wüthrich, 1986, John Wiley and Sons, New York: 176-199; Pflugrath et al., 1986, J. Molecular Biology 189: 383-386; Kline et al., 1986 J. Molecular Biology 189:377-382). While the secondary structure of a polypeptide may often be determined by NMR data, the spatial connections between individual pieces of secondary structure are not as readily determined. The structural coordinates of a polypeptide defined by x-ray crystallography can guide the NMR spectroscopist to an understanding of the spatial interactions between secondary structural elements in a polypeptide of related structure. Information on spatial interactions between secondary structural elements can greatly simplify NOE data from two-dimensional NMR experiments. In addition, applying the structural coordinates after the determination of secondary structure by NMR techniques simplifies the assignment of NOEs relating to particular amino acids in the polypeptide sequence.

In an embodiment, the invention relates to a method of determining three dimensional structures of polypeptides with unknown structures, by applying the structural coordinates of a crystal of the present invention to nuclear magnetic resonance data of the unknown structure. This method comprises the steps of: (a) determining the secondary structure of an unknown structure using NMR data; and (b) simplifying the assignment of through-space interactions of amino acids. The term “through-space interactions” defines the orientation of the secondary structural elements in the three dimensional structure and the distances between amino acids from different portions of the amino acid sequence. The term “assignment” defines a method of analyzing NMR data and identifying which amino acids give rise to signals in the NMR spectrum.

For all of this section on x-ray crystallography, see also Brooks et al. (1983) J Comput Chem 4:187-217; Weiner et al (1981) J. Comput. Chem. 106: 765; Eisenfield et al. (1991) Am J Physiol 261:C376-386; Lybrand (1991) J Pharm Belg 46:49-54; Froimowitz (1990) Biotechniques 8:640-644; Burbam et al. (1990) Proteins 7:99-111; Pedersen (1985) Environ Health Perspect 61:185-190; and Kini et al. (1991) J Biomol Struct Dyn 9:475-488; Ryckaert et al. (1977) J Comput Phys 23:327; Van Gunsteren et al. (1977) Mol Phys 34:1311; Anderson (1983) J Comput Phys 52:24; J. Mol. Biol. 48: 442-453, 1970; Dayhoff et al., Meth. Enzymol. 91: 524-545, 1983; Henikoff and Henikoff, Proc. Nat. Acad. Sci. USA 89: 10915-10919, 1992; J. Mol. Biol. 233: 716-738, 1993; Methods in Enzymology, Volume 276, Macromolecular crystallography, Part A, ISBN 0-12-182177-3 and Volume 277, Macromolecular crystallography, Part B, ISBN 0-12-182178-1, Eds. Charles W. Carter, Jr. and Robert M. Sweet (1997), Academic Press, San Diego; Pfuetzner, et al., J. Biol. Chem. 272: 430-434 (1997).

H.2. Analysis of Proteins by Nuclear Magnetic Resonance (NMR)

NMR may be used to characterize the structure of a polypeptide in accordance with the methods of the invention. In particular, NMR can be used, for example, to determine the three dimensional structure, the conformational state, the aggregation level, the state of protein folding/unfolding or the dynamic properties of a polypeptide. For example, the present invention contemplates a method for determining three dimensional structure information of a dengue virus E protein or other class II E protein, the method comprising: (a) generating a purified isotopically labeled dengue virus E protein or other class II E protein; and (b) subjecting the polypeptide to NMR spectroscopic analysis, thereby determining information about its three dimensional structure.

Interaction between a polypeptide and another molecule can also be monitored using NMR. Thus, the invention encompasses methods for detecting, designing and characterizing interactions between a polypeptide and another molecule, including polypeptides, nucleic acids and small molecules, utilizing NMR techniques. For example, the present invention contemplates a method for determining three dimensional structure information of a dengue virus E protein or other class II E protein, or a fragment thereof, while the polypeptide is complexed with another molecule, the method comprising: (a) generating a purified isotopically labeled dengue virus E protein or other class II E protein, or a fragment thereof; (b) forming a complex between the polypeptide and the other molecule; and (c) subjecting the complex to NMR spectroscopic analysis, thereby determining information about the three dimensional structure of the polypeptide. In another aspect, the present invention contemplates a method for identifying compounds that bind to a dengue virus E protein or other class II E protein, or a fragment thereof, the method comprising: (a) generating a first NMR spectrum of an isotopically labeled dengue virus E protein or other class II E protein, or a fragment thereof; (b) exposing the polypeptide to one or more chemical compounds; (c) generating a second NMR spectrum of the polypeptide which has been exposed to one or more chemical compounds; and (d) comparing the first and second spectra to determine differences between the first and the second spectra, wherein the differences are indicative of one or more compounds that have bound to the polypeptide.

Briefly, the NMR technique involves placing the material to be examined (usually in a suitable solvent) in a powerful magnetic field and irradiating it with radio frequency (rf) electromagnetic radiation. The nuclei of the various atoms will align themselves with the magnetic field until energized by the rf radiation. They then absorb this resonant energy and re-radiate it at a frequency dependent on i) the type of nucleus and ii) its atomic environment. Moreover, resonant energy may be passed from one nucleus to another, either through bonds or through three-dimensional space, thus giving information about the environment of a particular nucleus and nuclei in its vicinity.

However, it is important to recognize that not all nuclei are NMR active. Indeed, not all isotopes of the same element are active. For example, whereas “ordinary” hydrogen, ¹H, is NMR active, heavy hydrogen (deuterium), ²H, is not active in the same way. Thus, any material that normally contains ¹H hydrogen may be rendered “invisible” in the hydrogen NMR spectrum by replacing all or almost all the ¹H hydrogens with ²H. It is for this reason that NMR spectroscopic analyses of water-soluble materials frequently are performed in ²H₂O (deuterium oxide) to eliminate the water signal.

Conversely, “ordinary” carbon, ¹²C, is NMR inactive whereas the stable isotope, ¹³C, present to about 1% of total carbon in nature, is active. Similarly, while “ordinary” nitrogen, ¹⁴N, is NMR active, it has undesirable properties for NMR and resonates at a different frequency from the stable isotope ¹⁵N, present to about 0.4% of total nitrogen in nature.
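By way of non-limiting illustration, the dependence of resonance frequency on the type of nucleus may be made concrete with the Larmor relation ν = (γ/2π)·B₀. The gyromagnetic ratios below are approximate textbook values that do not appear in this specification, and the 14.1 T field strength (a so-called "600 MHz" magnet) is an assumption chosen only for the example.

```python
# Approximate gyromagnetic ratios divided by 2*pi, in MHz per tesla (textbook values).
GAMMA_OVER_2PI_MHZ_PER_T = {
    "1H": 42.577,
    "2H": 6.536,
    "13C": 10.708,
    "15N": -4.316,  # negative sign: opposite sense of precession
}

def larmor_mhz(nucleus, field_tesla):
    """Magnitude of the Larmor (resonance) frequency in MHz at the given field."""
    return abs(GAMMA_OVER_2PI_MHZ_PER_T[nucleus]) * field_tesla

FIELD_T = 14.1  # assumed field strength in tesla
for nucleus in ("1H", "2H", "13C", "15N"):
    print(f"{nucleus:>3}: {larmor_mhz(nucleus, FIELD_T):7.1f} MHz")
```

Because ²H resonates far from ¹H at any given field, deuterated material does not contribute to the ¹H spectrum, consistent with the use of ²H₂O described above.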

By labeling proteins with ¹⁵N and ¹⁵N/¹³C, it is possible to conduct analytical NMR of macromolecules with weights of 15 kD and 40 kD, respectively. More recently, partial deuteration of the protein in addition to ¹³C- and ¹⁵N-labeling has increased the possible weight of proteins and protein complexes for NMR analysis still further, to approximately 60-70 kD. See Shan et al., J. Am. Chem. Soc., 118:6570-6579 (1996); L. E. Kay, Methods Enzymol., 339:174-203 (2001); and K. H. Gardner & L. E. Kay, Annu Rev Biophys Biomol Struct., 27:357-406 (1998); and references cited therein.

Isotopic substitution may be accomplished by growing a bacterium or yeast or other type of cultured cells, transformed by genetic engineering to produce the protein of choice, in a growth medium containing ¹³C-, ¹⁵N- and/or ²H-labeled substrates. In certain instances, bacterial growth media consist of ¹³C-labeled glucose and/or ¹⁵N-labeled ammonium salts dissolved in D₂O where necessary. Kay, L. et al., Science, 249:411 (1990) and references therein and Bax, A., J. Am. Chem. Soc., 115, 4369 (1993). More recently, isotopically labeled media especially adapted for the labeling of bacterially produced macromolecules have been described. See U.S. Pat. No. 5,324,658.

The goal of these methods has been to achieve universal and/or random isotopic enrichment of all of the amino acids of the protein. By contrast, other methods allow only certain residues to be relatively enriched in ¹H, ²H, ¹³C and ¹⁵N. For example, Kay et al., J. Mol. Biol., 263, 627-636 (1996) and Kay et al., J. Am. Chem. Soc., 119, 7599-7600 (1997) have described methods whereby isoleucine, alanine, valine and leucine residues in a protein may be labeled with ²H, ¹³C and ¹⁵N, and may be specifically labeled with ¹H at the terminal methyl position. In this way, study of the proton-proton interactions between some amino acids may be facilitated. Similarly, a cell-free system has been described by Yokoyama et al., J. Biomol. NMR, 6(2), 129-134 (1995), wherein a transcription-translation system derived from E. coli was used to express human Ha-Ras protein incorporating ¹⁵N into serine and/or aspartic acid.

Techniques for producing isotopically labeled proteins and macromolecules, such as glycoproteins, in mammalian or insect cells have been described. See U.S. Pat. Nos. 5,393,669 and 5,627,044; Weller, C. T., Biochem., 35, 8815-23 (1996) and Lustbader, J. W., J. Biomol. NMR, 7, 295-304 (1996). Other methods for producing polypeptides and other molecules with labels appropriate for NMR are known in the art.

The present invention contemplates using a variety of solvents which are appropriate for NMR. For ¹H NMR, a deuterium lock solvent may be used. Exemplary deuterium lock solvents include acetone (CD₃COCD₃), chloroform (CDCl₃), dichloromethane (CD₂Cl₂), acetonitrile (CD₃CN), benzene (C₆D₆), water (D₂O), diethylether ((CD₃CD₂)₂O), dimethylether ((CD₃)₂O), N,N-dimethylformamide ((CD₃)₂NCDO), dimethyl sulfoxide (CD₃SOCD₃), ethanol (CD₃CD₂OD), methanol (CD₃OD), tetrahydrofuran (C₄D₈O), toluene (C₆D₅CD₃), pyridine (C₅D₅N) and cyclohexane (C₆D₁₂). For example, the present invention contemplates a composition comprising a dengue virus E protein or other class II E protein and a deuterium lock solvent.

The 2-dimensional ¹H-¹⁵N HSQC (Heteronuclear Single Quantum Correlation) spectrum provides a diagnostic fingerprint of conformational state, aggregation level, state of protein folding, and dynamic properties of a polypeptide (Yee et al, PNAS 99, 1825-30 (2002)). Polypeptides in aqueous solution usually populate an ensemble of 3-dimensional structures which can be determined by NMR. When the polypeptide is a stable globular protein or domain of a protein, then the ensemble of solution structures is one of very closely related conformations. In this case, one peak is expected for each non-proline residue with a dispersion of resonance frequencies with roughly equal intensity. Additional pairs of peaks from side-chain NH₂ groups are also often observed, and correspond to the approximate number of Gln and Asn residues in the protein. This type of HSQC spectrum usually indicates that the protein is amenable to structure determination by NMR methods.
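By way of non-limiting illustration, the expected peak count described above may be estimated from an amino acid sequence as sketched below. The sketch follows the counting rule stated in the text (one backbone peak per non-proline residue, plus one pair of side-chain peaks per Asn or Gln). It ignores practical details such as the N-terminal residue, whose amino group is usually not observed, and tryptophan side-chain NH signals. The function names, the tolerance and the short sequence are hypothetical.

```python
def expected_hsqc_peaks(sequence):
    """Estimate 1H-15N HSQC peak counts from a one-letter amino acid sequence."""
    seq = sequence.upper()
    backbone = sum(1 for residue in seq if residue != "P")  # proline has no amide proton
    nh2_pairs = seq.count("N") + seq.count("Q")             # Asn and Gln side chains
    return {"backbone": backbone, "sidechain_nh2_pairs": nh2_pairs}

def assess_peak_count(observed_backbone, expected_backbone, tolerance=0.1):
    """Classify an observed backbone peak count relative to the expected count."""
    low = expected_backbone * (1.0 - tolerance)
    high = expected_backbone * (1.0 + tolerance)
    if observed_backbone < low:
        return "too few"
    if observed_backbone > high:
        return "too many"
    return "consistent"

# Hypothetical nine-residue sequence: two prolines, one Asn and one Gln.
counts = expected_hsqc_peaks("MKPNQAGPW")
print(counts)  # {'backbone': 7, 'sidechain_nh2_pairs': 2}
print(assess_peak_count(7, counts["backbone"]))  # consistent
```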

If the HSQC spectrum shows well-dispersed peaks but there are either too few or too many in number, and/or the peak intensities differ throughout the spectrum, then the protein likely does not exist in a single globular conformation. Such spectral features are indicative of conformational heterogeneity with slow or nonexistent inter-conversion between states (too many peaks) or the presence of dynamic processes on an intermediate timescale that can broaden and obscure the NMR signals. Proteins with this type of spectrum can sometimes be stabilized into a single conformation by changing either the protein construct, the solution conditions, temperature or by binding of another molecule.

The ¹H-¹⁵N HSQC can also indicate whether a protein has formed large nonspecific aggregates or has dynamic properties. Alternatively, proteins that are largely unfolded, e.g., having very little regular secondary structure, result in ¹H-¹⁵N HSQC spectra in which the peaks are all very narrow and intense, but have very little spectral dispersion in the ¹H-dimension. This reflects the fact that many or most of the amide groups of amino acids in unfolded polypeptides are solvent exposed and experience similar chemical environments resulting in similar ¹H chemical shifts.

The use of the ¹H-¹⁵N HSQC can thus allow the rapid characterization of the conformational state, aggregation level, state of protein folding, and dynamic properties of a polypeptide. Additionally, other 2D spectra such as ¹H-¹³C HSQC or HNCO spectra can also be used in a similar manner. Further use of the ¹H-¹⁵N HSQC combined with relaxation measurements can reveal the molecular rotational correlation time and dynamic properties of polypeptides. The rotational correlation time is proportional to the size of the protein and therefore can reveal if it forms specific homo-oligomers such as homodimers, homotetramers, etc.
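By way of non-limiting illustration, the relationship between rotational correlation time and protein size may be estimated with the Stokes-Einstein-Debye relation for a rigid sphere, τc = 4πηr³/(3k_BT). This relation and every parameter value below (partial specific volume, hydration layer thickness, water viscosity at 298 K) are assumptions introduced for the sketch and are not taken from this specification; real proteins deviate from ideal spheres.

```python
import math

K_B = 1.380649e-23     # Boltzmann constant, J/K
N_A = 6.02214076e23    # Avogadro constant, 1/mol

def tau_c_seconds(mass_da, temp_k=298.15, viscosity_pa_s=8.9e-4,
                  specific_volume_cm3_g=0.73, hydration_nm=0.3):
    """Rotational correlation time of a hydrated sphere of the given molecular mass."""
    # volume of one molecule: (cm^3/g) * (g/mol) / (1/mol), converted from cm^3 to m^3
    volume_m3 = specific_volume_cm3_g * mass_da / N_A * 1e-6
    # radius of the equivalent sphere plus an assumed hydration layer
    radius_m = (3.0 * volume_m3 / (4.0 * math.pi)) ** (1.0 / 3.0) + hydration_nm * 1e-9
    return 4.0 * math.pi * viscosity_pa_s * radius_m ** 3 / (3.0 * K_B * temp_k)

# Hypothetical comparison of a 20 kD monomer with a 40 kD homodimer.
for mass in (20_000, 40_000):
    print(f"{mass / 1000:.0f} kD: {tau_c_seconds(mass) * 1e9:.1f} ns")
```

Under these assumptions the estimate rises with molecular mass, so a measured correlation time markedly longer than that expected for the monomer would be consistent with oligomer formation.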

The structure of stable globular proteins can be determined through a series of well-described procedures. For a general review of structure determination of globular proteins in solution by NMR spectroscopy, see Wüthrich, Science 243: 45-50 (1989). See also, Billeter et al., J. Mol. Biol. 155: 321-346 (1982). Current methods for structure determination usually require the complete or nearly complete sequence-specific assignment of ¹H-resonance frequencies of the protein and subsequent identification of approximate inter-hydrogen distances (from nuclear Overhauser effect (NOE) spectra) for use in restrained molecular dynamics calculations of the protein conformation. One approach for the analysis of NMR resonance assignments was first outlined by Wüthrich, Wagner and co-workers (Wüthrich, “NMR of Proteins and Nucleic Acids” Wiley, New York, N.Y. (1986); Wüthrich, Science 243: 45-50 (1989); Billeter et al., J. Mol. Biol. 155: 321-346 (1982)). Newer methods for determining the structures of globular proteins include the use of residual dipolar coupling restraints (Tian et al., J Am Chem Soc. 2001 Nov. 28; 123(47):11791-6; Bax et al, Methods Enzymol. 2001; 339:127-74) and empirically derived conformational restraints (Zweckstetter & Bax, J Am Chem Soc. 2001 Sep. 26; 123(38):9490-1). It has also been shown that it may be possible to determine structures of globular proteins using only un-assigned NOE measurements. NMR may also be used to determine ensembles of many inter-converting, unfolded conformations (Choy and Forman-Kay, J Mol Biol. 2001 May 18; 308(5):1011-32).

NMR analysis of a polypeptide in the presence and absence of a test compound (e.g., a polypeptide, nucleic acid or small molecule) may be used to characterize interactions between a polypeptide and another molecule. Because the ¹H-¹⁵N HSQC spectrum and other simple 2D NMR experiments can be obtained very quickly (on the order of minutes depending on protein concentration and NMR instrumentation), they are very useful for rapidly testing whether a polypeptide is able to bind to another molecule. Changes in the resonance frequency (in one or both dimensions) of one or more peaks in the HSQC spectrum indicate an interaction with another molecule. Often only a subset of the peaks will have changes in resonance frequency upon binding to another molecule, allowing one to map onto the structure those residues directly involved in the interaction or involved in conformational changes as a result of the interaction. If the interacting molecule is relatively large (protein or nucleic acid) the peak widths will also broaden due to the increased rotational correlation time of the complex. In some cases the peaks involved in the interaction may actually disappear from the NMR spectrum if the interacting molecule is in intermediate exchange on the NMR timescale (i.e., exchanging on and off the polypeptide at a frequency that is similar to the resonance frequency of the monitored nuclei).
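By way of non-limiting illustration, changes in peak position between the spectra recorded in the absence and presence of a test compound may be summarized per residue as a combined chemical shift change, sqrt(ΔδH² + (w·ΔδN)²). The ¹⁵N weighting factor w = 0.14 is one commonly used convention and the 0.05 ppm threshold is arbitrary; neither value is taken from this specification. The residue numbers and peak positions are hypothetical.

```python
import math

def combined_shift_change(d_h_ppm, d_n_ppm, n_weight=0.14):
    """Weighted combination of 1H and 15N chemical shift changes, in ppm."""
    return math.sqrt(d_h_ppm ** 2 + (n_weight * d_n_ppm) ** 2)

def compare_hsqc(free, bound, threshold_ppm=0.05, n_weight=0.14):
    """Compare two assigned peak lists mapping residue -> (1H ppm, 15N ppm).

    Returns (shifted, missing): residues whose combined change exceeds the
    threshold, and residues whose peak is absent from the second spectrum.
    """
    shifted, missing = {}, []
    for residue, (h_free, n_free) in sorted(free.items()):
        if residue not in bound:
            missing.append(residue)  # e.g. broadened beyond detection
            continue
        h_bound, n_bound = bound[residue]
        change = combined_shift_change(h_bound - h_free, n_bound - n_free, n_weight)
        if change > threshold_ppm:
            shifted[residue] = change
    return shifted, missing

# Hypothetical peak lists: residue 101 moves, 102 is unchanged, 103 disappears.
free = {101: (8.20, 120.0), 102: (7.95, 118.5), 103: (8.60, 125.2)}
bound = {101: (8.32, 121.0), 102: (7.95, 118.5)}

shifted, missing = compare_hsqc(free, bound)
print(sorted(shifted), missing)  # [101] [103]
```

Residues reported in either list could then be mapped onto the three-dimensional structure to locate the region affected by the interaction.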

To facilitate the acquisition of NMR data on a large number of compounds (e.g., a library of synthetic or naturally-occurring small organic compounds), a sample changer may be employed. Using the sample changer, a large number of samples, numbering 60 or more, may be run unattended. To facilitate processing of the NMR data, computer programs are used to transfer and automatically process the multiple one-dimensional NMR data.

In one embodiment, the invention provides a screening method for identifying small molecules capable of interacting with a dengue virus E protein or other class II E protein. In one example, the screening process begins with the generation or acquisition of either a T₂-filtered or a diffusion-filtered one-dimensional proton spectrum of the compound or mixture of compounds. Means for generating T₂-filtered or diffusion-filtered one-dimensional proton spectra are well known in the art (see, e.g., S. Meiboom and D. Gill, Rev. Sci. Instrum. 29:688 (1958), S. J. Gibbs and C. S. Johnson, Jr. J. Magn. Reson. 93:395-402 (1991) and A. S. Altieri, et al. J. Am. Chem. Soc. 117: 7566-7567 (1995)).

Following acquisition of the first spectrum for the molecules, the ¹⁵N- or ¹³C-labeled polypeptide is exposed to one or more molecules. Where more than one test compound is to be tested simultaneously, it is preferred to use a library of compounds such as a plurality of small molecules. Such molecules are typically dissolved in perdeuterated dimethylsulfoxide. The compounds in the library may be purchased from vendors or created according to desired needs.

Individual compounds may be selected inter alia on the basis of size and molecular diversity for maximizing the possibility of discovering compounds that interact with widely diverse binding sites of a subject amino acid sequence or other polypeptides of the invention.

The NMR screening process of the present invention utilizes a range of test compound concentrations, e.g., from about 0.05 to about 1.0 mM. At those exemplary concentrations, compounds which are acidic or basic may significantly change the pH of buffered protein solutions. Chemical shifts are sensitive to pH changes as well as direct binding interactions, and false-positive chemical shift changes, which are not the result of test compound binding but of changes in pH, may therefore be observed. It may therefore be necessary to ensure that the pH of the buffered solution does not change upon addition of the test compound.

Following exposure of the test compounds to a polypeptide (e.g., the target molecule for the experiment), a second one-dimensional T₂- or diffusion-filtered spectrum is generated. For the T₂-filtered approach, that second spectrum is generated in the same manner as set forth above. The first and second spectra are then compared to determine whether there are any differences between the two spectra. Differences in the one-dimensional T₂-filtered spectra indicate that the compound is binding to, or otherwise interacting with, the target molecule. Those differences are determined using standard procedures well known in the art. For the diffusion-filtered method, the second spectrum is generated by looking at the spectral differences between low and high gradient strengths, thus selecting for those compounds whose diffusion rates are comparable to that observed in the absence of target molecule.
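The comparison of first and second T₂-filtered spectra may, as one non-limiting sketch, be reduced to a comparison of integrated peak intensities per compound. The sketch assumes that binding shows up as a loss of filtered signal; the function name, the compound names, the intensities and the 30% cutoff are all hypothetical and are not taken from the cited art.

```python
def flag_binders(before, after, min_drop=0.3):
    """Flag compounds whose filtered signal drops once the target is added.

    before and after map compound name -> integrated peak intensity in the
    first (compound only) and second (compound plus target) spectra.
    min_drop is a hypothetical fractional cutoff, not a value from the text.
    """
    hits = []
    for name, i0 in before.items():
        if i0 <= 0:
            continue  # no usable reference signal for this compound
        drop = (i0 - after.get(name, 0.0)) / i0
        if drop >= min_drop:
            hits.append((name, round(drop, 2)))
    return sorted(hits)


# Made-up intensities, for illustration only.
before = {"cmpd_A": 100.0, "cmpd_B": 80.0, "cmpd_C": 50.0}
after = {"cmpd_A": 98.0, "cmpd_B": 40.0, "cmpd_C": 49.0}
hits = flag_binders(before, after)
print(hits)
```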

To discover additional molecules that bind to the protein, molecules are selected for testing based on the structure/activity relationships from the initial screen and/or structural information on the initial leads when bound to the protein. By way of example, the initial screening may result in the identification of compounds, all of which contain an aromatic ring. The second round of screening would then use other aromatic molecules as the test compounds.

In another embodiment, the methods of the invention utilize a process for detecting the binding of one ligand to a polypeptide in the presence of a second ligand. In accordance with this embodiment, a polypeptide is bound to the second ligand before exposing the polypeptide to the test compounds.

For more information on NMR methods encompassed by the present invention, see also: U.S. Pat. Nos. 5,668,734; 6,194,179; 6,162,627; 6,043,024; 5,817,474; 5,891,642; 5,989,827; 5,891,643; 6,077,682; WO 00/05414; WO 99/22019; Cavanagh, et al., Protein NMR Spectroscopy, Principles and Practice, 1996, Academic Press; Clore, et al., NMR of Proteins. In Topics in Molecular and Structural Biology, 1993, S. Neidle, Fuller, W., and Cohen, J. S., eds., Macmillan Press, Ltd., London; and Christendat et al., Nature Structural Biology 7: 903-909 (2000).

EXEMPLIFICATION

The invention, having been generally described, may be more readily understood by reference to the following examples, which are included merely for purposes of illustration of certain aspects and embodiments of the present invention, and are not intended to limit the invention in any way.

Example 1 Determination of the Pre-fusion Structure of E Protein from Dengue Virus Type 2 S1

A. Expression, Purification and Crystallization of E Protein from Dengue Virus Type 2 S1

E protein from dengue virus type 2 S1 strain (Hahn et al., 1988) was supplied by Hawaii Biotechnology Group, Inc. (Aiea, Hi.). The construct that encoded the E protein sequence (SEQ ID NO 2) spans nucleotides 937-2118 of GenBank Accession Number M19197 and is described in detail in Hahn, Y. S. et al. (1988), Virology 162, p 167-180. The following primers were used to amplify the dengue sE sequence by PCR: primer D2E937p2, 5′-cttctagatctcgagtacccgggacc ATG CGC TGC ATA GGA ATA TC-3′, and primer D2E2121m, 5′-gctctagagtcga cta tta TCC TTT CTT GAA CCA G-3′. Nucleotides corresponding to dengue cDNA are in upper case; non-dengue sequence is in lower case. The protein was expressed in Drosophila melanogaster Schneider 2 cells (ATCC, Rockville, Md.) from a pMtt vector (SmithKline Beecham) containing the dengue 2 prM and E genes (nucleotides 1-1185) as described by Ivy et al. (1997). The resulting prM-E preprotein was processed during secretion to yield soluble E protein, which was purified from the cell culture medium by immunoaffinity chromatography (Cuzzubbo et al., 2001).
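The upper-case/lower-case convention used for the primers can be checked mechanically. The following sketch, offered only as an illustration, separates the dengue-derived portion of each primer from the non-dengue tail and reverse-complements the antisense primer so that it reads in the sense direction. The helper names are hypothetical; the primer sequences are those given above.

```python
# Translation table for complementing DNA while preserving case.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def split_by_case(primer):
    """Split a primer into (non-dengue lower-case, dengue upper-case) parts.

    Assumes, as in both primers above, that the lower-case tail comes
    entirely before the upper-case dengue sequence.
    """
    seq = primer.replace(" ", "")
    lower = "".join(base for base in seq if base.islower())
    upper = "".join(base for base in seq if base.isupper())
    return lower, upper


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


FORWARD = "cttctagatctcgagtacccgggacc ATG CGC TGC ATA GGA ATA TC"
REVERSE = "gctctagagtcga cta tta TCC TTT CTT GAA CCA G"

fwd_tail, fwd_dengue = split_by_case(FORWARD)
rev_tail, rev_dengue = split_by_case(REVERSE)

print(fwd_dengue, len(fwd_dengue))
print(reverse_complement(rev_dengue), len(rev_dengue))
```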

Crystals grew from a 10 g/l solution at 4° C. by hanging drop vapor diffusion in 11% PEG 8000, 1 M sodium formate, 20% glycerol and 0.1 M HEPES pH 8. The addition of 0.5% n-octyl-β-D-glucoside prior to crystallization significantly improved the abundance and diffraction limit of the crystals. Dimensions of the primitive hexagonal cell were approximately a=b=81 Å, c=287 Å, with two molecules per asymmetric unit. An additional primitive hexagonal crystal form was observed, with cell dimensions a=b=75 Å, c=145 Å, and one molecule per asymmetric unit.

B. Data Collection and Processing

Crystals were derivatized by soaking in mother liquor containing 0.5 mM K₂PtCl₄, 0.5 mM Yb₂(SO₄)₃, 0.5 mM KAu(CN)₂, or 10 mM Me₃PbAc for 24 h. Datasets were collected at 100 K on beamlines A1 and F1 of the Cornell High Energy Synchrotron Source (Cornell University), except the ‘Native1’ dataset (see Table 1), which was collected on beamline ID-19 at the Advanced Photon Source (Argonne National Laboratory). The data were processed with HKL (Otwinowski and Minor, 1997). Table 1 lists the crystallographic data statistics for the structure of dengue virus E protein in complex with n-octyl-β-D-glucoside.

TABLE 1 CRYSTALLOGRAPHIC DATA STATISTICS. β-OG, N-OCTYL-β-D-GLUCOSIDE

Data collection and structure solution

Dataset                                     Native1     Native2     Me₃PbAc     K₂PtCl₄     YbSO₄       AuCN        Native3     Native4
Conc. β-OG (mM)                             17          17          17          17          17          17          0           0
Resolution range (Å)                        50-2.4      30-2.47     30-2.8      30-2.8      30-2.8      30-2.8      50-2.75     50-3.0
Cell edges a(=b)/c                          81.6/287.4  81.2/286.6  81.3/286.8  81.3/286.9  81.1/286.5  81.2/285.5  81.5/288.6  74.6/144.7
% completeness¹                             97 (74)     92 (45)     99 (98)     97 (88)     99 (97)     97 (99)     90 (49)     96 (82)
I/σ(I)¹                                     26.2 (3.3)  15.1 (1.8)  17.3 (4.0)  17.8 (4.7)  15.5 (2.7)  11.9 (6.2)  21.7 (2.5)  13.6 (2.0)
R_merge¹,² (%)                              6.9 (28.9)  6.1 (26.8)  6.0 (27.4)  8.2 (31.1)  7.4 (39.2)  5.3 (7.7)   7.9 (40.9)  8.4 (47.6)
Number of sites                             -           -           2           1           2           2           -           -
Phasing power³ (centric/acentric) (Sharp)   -           -           0.85/1.3    0.43/0.52   0.25/0.49   0.36/0.54   -           -
Phasing power³ (anomalous) (Sharp)          -           -           0.57        0.67        0.27        0.40        -           -
FOM⁴ centric (CNS/Sharp)                    0.68/0.24
FOM⁴ acentric (CNS/Sharp)                   0.34/0.24

Model building and refinement

Dataset                                     Native1     Native3
Resolution range (Å)                        50-2.4      50-2.75
Unique reflections                          44,435      24,851
R_cryst¹,⁵                                  0.2633      0.2610
R_free¹,⁶                                   0.2938      0.2964
Average B-factor (Å²)
  Protein (chain A/B)                       88.7/64.5   79.1/72.8
  Solvent                                   84.8        78.9
R.m.s. deviation
  Bond length (Å)                           0.011       0.009
  Bond angle (°)                            1.706       1.415
  Bonded B-factor (Å²), main chain          4.37        3.30
  Bonded B-factor (Å²), side chain          7.39        5.87
Ramachandran plot (%)
  Favored                                   82.3        73.2
  Allowed                                   17.0        26.8
  Generous                                  0.7         0.0
  Disallowed                                0           0

¹ Number in parentheses is for the highest resolution shell.
² R_merge = Σ_hkl |I − <I>| / Σ_hkl Σ_i (I)
³ Phasing power = (FH / lack of closure)
⁴ FOM = ((cos φ)² + (sin φ)²)^(1/2)
⁵ R_cryst = Σ_hkl ||F_obs| − |<F_calc>|| / Σ_hkl |F_obs|
⁶ R_free = R_cryst using 5% of F_obs sequestered before refinement
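The R_cryst and R_free definitions in the footnotes to Table 1 can be illustrated with a short sketch. The structure-factor amplitudes below are toy numbers, and taking every twentieth reflection as the 5% test set is a hypothetical stand-in for however the test set was actually chosen; the sketch is not the procedure used by CNS.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |F_obs| - |F_calc| |) / sum(|F_obs|) over the given reflections."""
    num = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    den = sum(abs(o) for o in f_obs)
    return num / den


def split_work_free(n_reflections, every=20):
    """Return (work, free) index lists; 1 in 20 reflections is about 5%."""
    free = [i for i in range(n_reflections) if i % every == 0]
    work = [i for i in range(n_reflections) if i % every != 0]
    return work, free


# Toy amplitudes: calculated values alternate 10% below and 10% above observed.
f_obs = [100.0 + 3.0 * i for i in range(40)]
f_calc = [f * (1.1 if i % 2 else 0.9) for i, f in enumerate(f_obs)]

work, free = split_work_free(len(f_obs))
r_cryst = r_factor([f_obs[i] for i in work], [f_calc[i] for i in work])
r_free = r_factor([f_obs[i] for i in free], [f_calc[i] for i in free])
print(round(r_cryst, 4), round(r_free, 4))
```

In a real refinement the free reflections are excluded from the target function, so R_free is typically somewhat higher than R_cryst, as in Table 1.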

C. Structure Determination and Refinement

The pronounced anisotropy of the datasets was corrected by scaling each dataset anisotropically to a calculated dataset obtained from an arbitrary set of atomic coordinates. The datasets were scaled to the most isomorphous native dataset, ‘Native2’ (Table 1), and isomorphous difference Pattersons were calculated with SOLVE (Terwilliger and Berendzen, 1999). Two initial heavy atom sites were identified using the lead derivative. Additional sites were located in the three other derivative datasets using cross-difference Fourier maps. Initial phases were optimized by refining the heavy atom parameters against maximum likelihood targets with SHARP (La Fortelle and Bricogne, 1997). Phases were improved by solvent flattening and two-fold non-crystallographic symmetry (NCS) averaging with DM (Collaborative Computational Project, 1994) and RESOLVE (Terwilliger, 1999). The solvent content was assumed to be 43%. The space group was determined as P3₁21, based on interpretable features in density-modified maps. An initial model was built into the maps with O (Jones et al., 1991). The atomic coordinates were refined against the best native dataset, ‘Native1’ (Table 1), first as a rigid body, then by simulated annealing using torsion angle dynamics with CNS (Brunger et al., 1998). Further cycles also included restrained refinement of B-factors for individual atoms and energy minimization against maximum likelihood targets with CNS. Because the electron density for one of the molecules in the dengue E dimer was more poorly defined than the other, the atomic coordinates of the two molecules were tightly restrained throughout refinement and therefore have very similar structures: the Rmsd is 0.34 Å (including side chain atoms). The B-factors were left unrestrained due to a large difference in overall B-factors for the two molecules in the asymmetric unit. The atomic model was completed using 2F_(o)-F_(c) and F_(o)-F_(c) Fourier maps. 137 water molecules were added using an automated procedure in CNS and by visual inspection. The final model also includes two glycans and one molecule of n-octyl-β-D-glucoside (β-OG) per protein molecule.

The structure of dengue E in the absence of β-OG was determined by refining the atomic coordinates against the ‘Native3’ dataset (Table 1), which was collected from a crystal grown in the absence of β-OG. The protein atoms were first refined as six rigid bodies, corresponding to domains I, II and III of each of the two chains in the asymmetric unit. The kl hairpin (residues 270-279) and residues 165-169 were completely rebuilt. Further refinement cycles consisted of simulated annealing using torsion angle dynamics, restrained B-factor refinement for individual atoms, and energy minimization against maximum likelihood targets with CNS (Brunger et al., 1998). The structure of unliganded dengue E was determined in a second crystal form (dataset ‘Native4’) by molecular replacement, using a dengue E monomer as the search model in AMoRe (Navaza, 2001). The space group was identified in the translation search as P3₂21, with only one molecule per asymmetric unit. Rigid body refinement of domains I, II and III resulted in substantial shifts, especially for domain II, which rotated approximately 5° with respect to domains I and III. The axis of rotation passes through residue 193, and is roughly perpendicular to the dyad axis of the dimer. Further refinement cycles consisted of simulated annealing, restrained individual B-factor refinement, and energy minimization with CNS (Brunger et al., 1998). The stereochemical quality of each atomic model was validated with PROCHECK (Laskowski et al., 1993). Statistics for data collection, phasing and refinement are presented in Table 1.
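A domain rotation such as the approximately 5° shift of domain II is commonly reported as the angle of the rotation matrix that superimposes the domain in one crystal form onto the other. As a minimal sketch, and assuming such a 3x3 rotation matrix is already in hand from a superposition program, the angle follows from the trace of the matrix. The test matrix below is constructed for illustration and is not derived from the deposited coordinates.

```python
import math


def rotation_about_z(angle_deg):
    """Build a 3x3 rotation matrix about the z axis (used here as test input)."""
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rotation_angle_deg(r):
    """Angle of a rotation matrix, from cos(theta) = (trace(R) - 1) / 2."""
    trace = r[0][0] + r[1][1] + r[2][2]
    # Clamp to [-1, 1] to guard against rounding error before acos.
    cos_t = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_t))


angle = rotation_angle_deg(rotation_about_z(5.0))
print(round(angle, 3))
```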

D. Atomic Coordinates

The coordinates and structure factors were deposited on Jan. 16, 2003 in the Protein Data Bank under accession numbers 1OKE and 1OAN.

E. General Description of Structure and Druggable Regions

A hydrophobic pocket in the dengue E protein must open up as a first step in the low-pH induced conformational transition, and in one of our crystal structures, a small molecule (β-octyl glucoside) is bound in this pocket.

We have determined the structure of a soluble fragment (residues 1-394) of the E protein from dengue virus type 2. This fragment contains all but about 45 residues of the E-protein ectodomain. It resembles closely, in its dimeric structure and in the details of its protein fold, the E protein from TBE, studied previously (Rey et al., 1995). Domain I, the central, 9-strand β-barrel, organizes the structure. Insertions between strands D and E and strands H and I form the elongated domain II, which bears the fusion peptide at its tip. Domain III is an Ig-like module. Each domain of dengue sE has the same folded structure as its TBE counterpart, but several loops diverge in conformation. The relative domain orientations are also slightly different, consistent with the notion that the links between them might be flexible.

One consistent difference between E proteins from tick-borne and mosquito-borne flaviviruses is the presence in the latter of an additional four residues (382-385) between strands F and G of domain III. In our structure, these residues form a compact solvent-exposed bulge. Their relatively high temperature factors suggest some degree of flexibility. This loop has been implicated in receptor binding (Crill and Roehrig, 2001).

FIG. 1B shows the three-domain structure of the dengue virus sE dimer. In all three domains, β-strands predominate.

There are two glycosylated asparagines on each dengue E subunit: Asn153 on domain I and Asn67 on domain II. Asn153, conserved in most flavivirus envelope proteins, bears a structure modeled here as a tetrasaccharide, although it contains additional, poorly ordered sugars. The fourth sugar is a mannose, which appears to be important for viral entry (Hung et al., 1999). The glycan projects outward from the surface of the protein, and somewhat discontinuous electron-density features suggest that it makes a crystal contact with the Asn67 glycan of another sE dimer (FIG. 2). In TBE, its homolog extends laterally across the dimer interface and “covers” the fusion peptide (residues 100-108) on domain II of the dimer partner. In the absence of a crystal contact, the dengue Asn153 oligosaccharide might do likewise. Indeed, stabilization of the dimer by the oligosaccharide would be consistent with the properties of non-glycosylated mutants of dengue, which fuse with target membranes at a higher pH (Guirakhoo et al., 1993; Kawano et al., 1993; Pletnev et al., 1993).

We have examined crystals grown in both the presence and the absence of the detergent, β-OG.

The most significant difference between the structures of dengue sE with and without β-OG is an altered conformation of the kl loop, which shifts toward the dimer contact in the presence of the detergent. To effect this movement, strands k and l switch sheets, from F0E0D0lk to efgkl (FIG. 1C; see also FIG. 2 of Rey et al. (1995)). The shift closes the “holes” along the dimer contact to either side of the twofold axis and opens a tapering, hydrophobic channel at the interface between domains I and II. This channel accepts a single β-OG molecule. The β-OG glucosyl head group lies at the channel's mouth, with several hydrogen bonds fixing a well-ordered orientation; the hydrocarbon chain projects well into the channel's cavity. In TBEV sE, which was studied in the absence of β-OG, the kl loop is in the “closed” position, and the hydrophobic residues are buried (FIG. 1D).

Mutations of residues that participate in the domain I/II interface just described alter the threshold pH for fusion (FIG. 3). Most of them involve side chains in the β-OG binding pocket. We take this correlation as a strong indication that domains I and II indeed change orientation during the fusion-promoting conformational change. We propose that the opening of the kl hairpin pries open the hydrophobic interface, causing domain II to hinge outwards and to project the fusion peptide at its tip toward the membrane of the target cell. Two crystallographic observations are consistent with such a hinge. In a second crystal form of dengue sE without β-OG, domain II shifts by just this type of displacement (about 5°), with respect to domains I and III. The same is true for a second crystal form of TBE sE (Rey and Harrison). In both cases, the hinge angle is quite small, because a larger bend would disrupt the dimer contact at the tip of domain II and expose the fusion peptide. Indeed, it is just such a disruption that occurs at low pH.

In the pH-threshold mutations, substitution of longer hydrophobic side chains by shorter ones generally leads to fusion at lower pH. We suggest that shorter side chains may allow a tighter and more stable closed form of the pocket, requiring a greater drop in pH to flip it open. Attenuated viruses with single mutations in the kl hairpin region have been obtained by passage in cell culture (Lee et al., 1997; Monath et al., 2002). Accumulation of such mutations might result in even stronger attenuation.

The outer surfaces of mature flavivirus particles contain 180 subunits each of E and M, in a compactly organized icosahedral array (Lindenbach and Rice, 2001). Any conformational change in E is therefore likely to induce a concerted reorganization across the entire surface of the virion. The E proteins cluster into trimers when they undergo their conformational change induced by low pH (Allison et al., 1995). Image reconstructions from electron cryomicroscopy of fusion-competent TBE recombinant subviral particles, which contain 60 subunits each of E and M (Ferlenghi et al., 2001), show that if domain II does hinge outwards during the low-pH induced transition, then a modest reorientation of subunits within the surface lattice will allow three of these domains to associate (FIG. 4A). The cluster thus formed will display three fusion-peptide loops at its tip. The packing of E deduced from image reconstructions of dengue virions (Kuhn et al., 2002) is at odds with this simple view, however, since the 90 dimers are not related by local threefold symmetry (FIG. 4B). Rossmann and co-workers have suggested that the surface proteins might rearrange to the structure shown in FIG. 4C as part of the low-pH induced reorganization (Kuhn et al., 2002). Note the similarities between the structures shown in FIGS. 4A (left) and 4C. As domain II bends outward, it will release many of the surface-lattice packing constraints, giving individual E subunits (or groups of subunits) considerable lateral freedom. The very tight packing of subunits in the surface of the virion at neutral pH may therefore not, in practice, be a hindrance to the postulated rearrangement.

The structure also suggests the possibility of a second druggable region, the “domain 1-3” druggable region, at which drug binding might inhibit fusion. That site, located between domains 1 and 3, is bounded by the following residues: 38-40; 143-147; 294-296; and 354-365. A small molecule bound in the pocket defined by those segments of polypeptide chain could stabilize the conformation of the E protein seen in our structure and thereby inhibit a transition to the fusion-active conformation.
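For use in docking or contact analysis, the residue ranges bounding the “domain 1-3” druggable region can be expanded into an explicit residue list. The sketch below is illustrative only; the function names and the example contact list are hypothetical, and the ranges are those recited above.

```python
def expand_ranges(spec):
    """Expand a string such as '38-40; 143-147' into a list of residue numbers."""
    residues = []
    for part in spec.split(";"):
        part = part.strip()
        if not part:
            continue
        start, _, end = part.partition("-")
        end = end or start  # allow a single residue such as '67'
        residues.extend(range(int(start), int(end) + 1))
    return residues


DOMAIN_1_3_REGION = "38-40; 143-147; 294-296; 354-365"
region = expand_ranges(DOMAIN_1_3_REGION)


def contacts_in_region(contact_residues, region_residues):
    """Return the ligand-contact residues that fall inside the region."""
    allowed = set(region_residues)
    return sorted(r for r in contact_residues if r in allowed)


# Hypothetical contact list from a docking pose, for illustration only.
print(len(region), contacts_in_region([39, 100, 145, 360], region))
```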

In conclusion, we have identified the kl hairpin as a key structural element for initiating the low-pH conformational change that leads to formation of fusion-competent trimers. The opening up of a ligand-binding pocket just at the locus of a likely hinge suggests that compounds inserted at this position might hinder further conformational change and hence modulate the fusion transition. In the context of the virion surface, their action might resemble that of some of the well-studied anti-picornaviral compounds, which block a concerted structural transition in the icosahedral assembly (Smith et al., 1986). Our structural observations suggest direct ways to search for such modulators.

Example 2 Determination of the Post-fusion Structure of E Protein from Dengue Virus Type 2 S1

A. Expression, Purification and Crystallization of E Protein from Dengue Virus Type 2 S1

Soluble E protein (sE) from dengue virus type 2 S1 strain was supplied by Hawaii Biotech. The protein was expressed in Drosophila melanogaster Schneider 2 cells (obtained from ATCC) using a pMtt vector (GlaxoSmithKline) containing the dengue 2 prM and E genes (nucleotides 539-2121 of the sequence) as described in Section A of Example 1. The resulting prM-E preprotein is processed during secretion to yield sE, which was purified from the cell culture medium by immunoaffinity chromatography.

Dengue sE trimers were obtained as follows, based on a method developed for tick-borne encephalitis virus sE (Stiasny, K., et al. J. Virol. 76, 3784-3790 (2002)). 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine, 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphoethanolamine (Avanti Polar Lipids) and 1-cholesterol (Sigma) were dissolved in chloroform, mixed in a 1:1:1 molar ratio, and dried under high vacuum for at least 4 h. The lipid film was resuspended in 10 mM triethanolamine (TEA) pH 8.3, 0.14 M NaCl and subjected to five cycles of freeze-thawing, followed by 21 cycles of extrusion through two 0.2 μm polycarbonate filter membranes (Whatman). Purified sE was added in a 1:680 protein:lipid molar ratio and incubated at 37° C. for 5 min. The pH was lowered to endosomal levels by adding 75 mM MES pH 5.4, and the protein was incubated at 37° C. for 30 min. Liposomes were solubilized with a 20-fold molar excess of n-octyl-β-D-glucoside (β-OG) and 4 mM n-undecyl-β-D-maltoside (UDM) (Anatrace). Excess lipid was removed by cation exchange chromatography with MonoS (Pharmacia) in 25 mM citric acid pH 5.26, 70 mM NaCl, 4 mM UDM. After washing with 0.4 M NaCl, the protein was eluted with 1-1.5 M NaCl. E trimers were further purified by gel filtration on a Superdex 200 column (Pharmacia) in 8 mM TEA pH 7, 80 mM NaCl, 3 mM UDM. The sE trimers were concentrated to about 15 mg ml⁻¹ for crystallization and dialyzed against the gel filtration buffer using a 50-kDa molecular-mass cutoff membrane (Spectrapor).

Crystals grew at 20° C. by hanging drop vapor diffusion by mixing equal volumes of protein solution and the following reservoir solution: 20-30% polyethylene glycol 400 (PEG400), 0.1 M MOPS pH 7-8 or Tris pH 8-9, 80 mM NaCl. Two crystal forms were obtained: plates of space group P321 with cell dimensions a=b=76.2 Å, c=131 Å, and rhomboids of space group P3₂21 with cell dimensions a=b=153 Å, c=143 Å. The asymmetric unit of the P321 crystals contains one molecule of sE; that of the P3₂21 crystals, one trimer (three molecules) of sE.

B. Data Collection and Processing

Cryoprotection was achieved by raising the concentration of PEG400 to 30%. Crystals were frozen in liquid nitrogen, and all data were collected at 100 K on BioCARS beamline 14-BM-C at the Advanced Photon Source (Argonne National Laboratory). The data were processed with HKL. Data collection statistics are presented in Table 2.

TABLE 2 Crystallographic data and refinement statistics.

Dataset (space group)                       P321        P3₂21
Molecules per asymm. unit                   1           3
Cell edges a(=b)/c                          76.2/131    153/143
Resolution range (Å)                        30-2.0      30-3.25
% completeness                              95 (75)     99 (98)
I/σ(I)                                      24.5 (2.0)  14.3 (3.0)
R_merge (%)                                 5.8 (39.8)  13.2 (54.3)
Unique reflections                          27,450      30,779
R_cryst                                     0.2213
R_free†                                     0.2671
Average B-factor (Å²)
  Protein (chain A/B/C)                     29.6
  Solvent                                   34.4
Rmsd bond length (Å)                        0.006
Rmsd bond angle (°)                         1.410
Rmsd bonded B-factor (Å²)
  Main chain                                3.29
  Side chain                                4.72
Rmsd (trimer-dimer) (Å)
  Domain I                                  5.69
  Domain II                                 2.61
  Domain III                                2.21
Ramachandran plot (%)
  Favored                                   88.9
  Allowed                                   10.2
  Generous                                  0.9
  Disallowed                                0

Rmsd, root mean square difference.
R_cryst = Σ_hkl ||F_obs| − |<F_calc>|| / Σ_hkl |F_obs|.
† R_free = R_cryst using 5% of F_obs sequestered before refinement.

C. Structure Determination and Refinement

The crystal structure of dengue E in the post-fusion conformation was determined by molecular replacement using individual domains from the pre-fusion dengue E structure (described in Example 1) (Protein Data Bank code 1OKE) as search models, and the P321 dataset (Table 2). Domain II was placed first, followed by domain I, with AMoRe. Domain III was placed last, with CNS. The atomic coordinates of the three domains were refined as rigid bodies. The model was rebuilt with O based on 2F_(o)-F_(c) and F_(o)-F_(c) Fourier maps. Residues 1-17, 34-40, 49-54, 128-137, 165-192, 290-299 and 341-346 were built de novo. Coordinates were then refined against data up to 2.0 Å resolution by simulated annealing using torsion angle dynamics with CNS, and rebuilt with O, in iterative cycles. Later cycles included restrained refinement of B-factors for individual atoms and energy minimization against maximum likelihood targets with CNS. The final model contains residues 1-144 and 159-394, an N-acetylglucosamine glycan on residue 67, 205 water molecules and one chloride ion. Residues 145-158 and the glycan on residue 153 were disordered. The stereochemical quality of the atomic model was validated with PROCHECK. Refinement statistics are presented in Table 2.

D. Atomic Coordinates

Coordinates have been deposited in the Protein Data Bank under accession code 1OK8.

E. Electron Microscopy

Dengue sE trimers inserted into liposomes were prepared as described above and adsorbed to glow-discharged, carbon-coated copper grids. Samples were washed with two drops of deionized water, stained with two drops of 0.7% uranyl formate for 20 s, washed with water, and blotted gently. Micrographs were recorded on a Philips Tecnai 12 electron microscope at 100 kV and 64,000-fold magnification.

F. Crosslinking

To determine the oligomerization state of sE by SDS-PAGE, sE was covalently cross-linked with ethylene glycol bis-(succinimidyl succinate) (EGS). EGS (10 μM-1 mM) was added from a fresh 0.1 M stock solution in dimethyl sulfoxide to about 5 μg E at 10 μg ml⁻¹. Acidic solutions were neutralized with TEA pH 8.5. After 30 min at room temperature, EGS was quenched with 20 mM Tris for 15 min. Protein was precipitated with trichloroacetic acid and resuspended in SDS-PAGE sample buffer for gel electrophoresis.
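As a worked example of the dilution implied above, about 5 μg of E at 10 μg ml⁻¹ corresponds to a sample volume of about 0.5 ml. The sketch below computes how much 0.1 M EGS stock would bring that sample to a chosen final concentration, accounting for the added volume. The function name is hypothetical, and the calculation is offered only as an illustration of the arithmetic.

```python
def stock_volume_ul(sample_ul, stock_mM, final_mM):
    """Volume of stock (ul) giving final_mM after mixing with sample_ul.

    Solves stock_mM * v = final_mM * (sample_ul + v) for v.
    """
    if final_mM >= stock_mM:
        raise ValueError("final concentration must be below the stock concentration")
    return sample_ul * final_mM / (stock_mM - final_mM)


# 5 ug of protein at 10 ug/ml is 0.5 ml, i.e. 500 ul.
sample_ul = 5.0 / 10.0 * 1000.0

for final_mM in (0.01, 0.1, 1.0):  # 10 uM, 100 uM and 1 mM EGS
    v = stock_volume_ul(sample_ul, stock_mM=100.0, final_mM=final_mM)
    print(f"{final_mM} mM EGS: add {v:.3f} ul of 0.1 M stock")
```

At the lowest concentrations the required volume falls well below a microliter, so an intermediate dilution of the stock would be the practical route.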

G. General Description of Structure and Druggable Regions

The crystal structure we describe here, of the soluble ectodomain of dengue virus type 2 E protein (sE) in its trimeric, post-fusion conformation, provides valuable insight into the mechanism of fusion. The fusion loops of the three subunits come together to form a membrane-insertable, “aromatic anchor” at the tip of the trimer. The fusion loop retains its pre-fusion conformation. Neighboring hydrophilic groups will restrict insertion to the proximal part of the outer lipid-bilayer leaflet. The entire ectodomain of the protein folds back on itself, directing the C-terminal, viral membrane anchor toward the fusion loop. The fusion loop may serve as a druggable region, for example, as a target for molecules that inhibit its interactions.

Comparison with the pre-fusion structure of the same protein allows us to propose a mechanism for fusion driven by an essentially irreversible conformational change in the protein and assisted by membrane distortions imposed by fusion-loop insertion. Specific features of the folded-back structure suggest strategies for inhibiting flavivirus entry.

Membrane Insertion and Trimer Formation

Like its TBE homolog, the dimer formed by dengue sE (residues 1-395 of E) dissociates reversibly. At acidic pH, dissociation is essentially complete at protein concentrations of 1 mg ml⁻¹; at neutral pH, the dissociation constant is one to two orders of magnitude smaller. The fusion loop at the tip of domain II would be exposed in the monomer, but exposure does not cause non-specific aggregation of the protein. Liposome coflotation experiments show that the fusion loop of monomeric TBE sE allows association with lipid membranes and that this membrane association catalyzes irreversible formation of sE trimers at low pH. Dengue E exhibits identical behavior: upon acidification, sE dimers dissociate, bind liposomes and trimerize. Membrane-associated sE is readily detected by electron microscopy of negatively stained preparations (FIG. 6A); chemical cross-linking confirms that the protein has trimerized (FIG. 6B). The trimers are tapered rods, about 70-80 Å long and 30-50 Å in diameter, with the long axis perpendicular to the membrane and their wide end distal. They tend to cluster on the liposome surface, often forming a continuous layer. These heavily decorated areas appear to have a greater than average membrane curvature, resulting in smaller vesicles (FIG. 6A). This observation suggests that E trimers can induce curvature, a property that may be significant for promoting fusion (see below). The dengue sE trimers can be solubilized with the detergent n-octyl-β-D-glucoside (β-OG); they remain trimeric at all pHs between 5 and 9, as determined by gel filtration chromatography.
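The dependence of dimer dissociation on the dissociation constant can be made concrete with a simple equilibrium model, D ⇌ 2M with K_d = [M]²/[D]. The sketch below is illustrative only: the K_d values are hypothetical, chosen to differ by two orders of magnitude, and are not measured values from this disclosure.

```python
import math


def monomer_fraction(total_conc, kd):
    """Fraction of subunits present as monomer for D <-> 2M, Kd = [M]^2 / [D].

    total_conc is the total subunit concentration, [M] + 2[D], in the same
    units as kd. Solving 2[M]^2 + Kd[M] - Kd*total_conc = 0 for [M] gives
    [M] = Kd * (sqrt(1 + 8*total_conc/Kd) - 1) / 4.
    """
    m = kd * (math.sqrt(1.0 + 8.0 * total_conc / kd) - 1.0) / 4.0
    return m / total_conc


total = 1.0  # arbitrary concentration units
for kd in (100.0, 1.0):  # hypothetical: weak dimer vs. 100-fold tighter dimer
    print(f"Kd = {kd}: monomer fraction = {monomer_fraction(total, kd):.2f}")
```

In this model a dissociation constant well above the protein concentration gives nearly complete dissociation, while one comparable to the protein concentration leaves about half of the subunits in dimers.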

Structure of the Trimer: Domain Rearrangements

The three domains of the sE retain most of their folded structures, but undergo major rearrangements in their relative orientations, through flexion of the interdomain linkers (FIG. 7). Domain II rotates approximately 30° with respect to domain I, about a hinge near residue 191 and the kl hairpin (residues 270-279), where mutations that affect the pH threshold of fusion are concentrated. As a result of the rotation, the base of the kl hairpin is pulled apart, and the l strand forms a new set of hydrogen bonds with the D0 strand of domain I, shifted by two residues from the hydrogen bonding pattern in the dimer. Although detergent is present, the kl hairpin does not adopt the open conformation seen in the dimer with bound β-OG. The small hydrophobic core beneath this hairpin seems to be a “greased hinge” for the rotation between domains I and II. The kl hairpin region and the hydrophobic core beneath it may serve as a druggable region.

Domain III undergoes the most significant displacement in the dimer-to-trimer transition. It rotates by about 70°, and its center of mass shifts by 36 Å towards domain II. This folding-over brings the C-terminus of domain III (residue 395) 39 Å closer to the fusion loop and positions it at the entrance of a channel, which extends toward the fusion loops along the intersubunit contact between domains II (FIG. 8A, B). The 53-residue “stem” connecting the end of the sE fragment with the viral transmembrane anchor could easily span the length of this channel, even if the stem were entirely α-helical. By binding in the channel, the stem would contribute additional trimer contacts with domain II of another subunit (FIG. 8B). The stem does indeed promote trimer assembly even in the absence of liposomes. In the virion, the stem appears to form two α-helical segments, which lie in the outer surface of the lipid bilayer and contact the membrane-facing surface of the subunit from which they emanate. The stem, or portions thereof, may serve as a druggable region. Further, areas in which the stem binds, such as the channel, may also serve as druggable regions.

Changes in the Secondary Structure

The 10-residue linker between domains I and III accommodates their large relative displacement during trimer formation. The linker, which has a poorly ordered, extended structure in the pre-fusion dimer (FIG. 7A), inserts as a short β-strand between strands A0 and C0 in domain I (FIG. 7B). As part of this rearrangement, the C-terminal region of A0 peels away from C0 and switches to the other β-sheet, thereby creating the surface for an annular trimer contact with the two other A0 strands in the trimer.

The transition to the trimer state is irreversible. It may represent the step at which virus and host-cell membranes are forced together to promote fusion. The refoldings just described may impart irreversibility. They resemble in some respects another well-known irreversible protein refolding, activation of serpins, in which a β-strand also inserts between two other strands in a previously formed sheet. The chain rearrangements in dengue E can contribute a high barrier to initiation of trimerization (sE monomers do not trimerize at low pH without liposomes) and an even higher barrier to dissociation of trimers once they have formed.

Trimer Contacts

Dengue E trimers assemble through both polar and nonpolar contacts in four areas: at the membrane-distal end of the trimer, at the base of domain II, at the tip of domain II, and at the packing interface between domains I and II (FIG. 8). The total surface buried per monomer during trimer assembly is 3900 Å², twice the 1950 Å² per monomer buried in the dimer. An additional 1035 Å² are buried within each monomer during the domain rearrangements observed in our structure. These numbers help explain why trimers are much more stable than dimers in solution. They also help account for the irreversibility of the fusion-activating conformational change. Additional trimer contacts are likely to be contributed by the stems, as described above.

An extended cavity, which runs along the threefold axis, separates the trimer contact areas at the top and at the base of domain II. A narrow opening connects this cavity with the exterior solvent, but it may be occluded by the stem in the full-length protein (FIG. 8B). An anion, modeled as a chloride, lies on the threefold axis near the tip of domain II. It is liganded by three amide nitrogens (from Lys110 on each of the subunits) and by three water molecules. Between this anion and the domain-II tip, a small hydrophobic core underpins the nonpolar, bowl-shaped apex formed by the three clustered fusion loops.

The Fusion Loop

The fusion loops in the sE trimer have the same conformation as in the dimer. Because the trimers are obtained by detergent extraction from liposomes, we conclude that this conformation is also present when the loop inserts into a membrane. Furthermore, as dimers can dissociate reversibly, the fusion loop is stable when fully exposed. In short, it appears that the fusion loop retains essentially the same conformation, whether buried against another subunit, inserted into a lipid membrane, or exposed to aqueous solvent.

In the trimer, the three hydrophobic residues in the fusion loop conserved among all flaviviruses (Trp101, Leu107 and Phe108) are fully exposed on the molecular surface, near the threefold axis. They form a bowl-like concavity at the trimer tip, with a hydrophobic rim (FIG. 8E). There are no lipid or detergent molecules visible in the electron density near the fusion loop in either of our crystal forms. Indeed, in the P321 crystal form, there can be no detergent micelle covering the fusion loop, as this region is involved in close crystal contacts with residues in domain III of a symmetry-related molecule. We conclude that detergent is required to dissolve away the liposome on which the trimer formed and hence to solubilize the protein, but that once the protein has been extracted from the membrane, the threefold-clustered fusion loops do not retain a tightly associated detergent micelle.

How deeply, then, do fusion loops penetrate into the membrane? Tryptophan side chains tend to appear in membrane proteins at the interface between the hydrocarbon and head-group layers of the lipid (ref. 18), but if the indole amine participates in a hydrogen bond, as is the case for Trp101, the side chain may be completely buried in the hydrocarbon layer. We therefore propose that the E trimers penetrate about 6 Å into the hydrocarbon layer of the target membrane. They cannot penetrate further, because of exposed carbonyls and charged side chains on the outside rim of the fusion-loop bowl (FIG. 8E). Thus, the fusion loop is held in the membrane mainly by an “aromatic anchor” formed by Trp101 and Phe108. The bowl is lined by the hydrophobic side chains of Leu107 and Phe108, so that it cannot accommodate lipid headgroups. We expect that fatty-acid chains from the inner leaflet of the membrane may extend across to contact the base of the fusion-loop bowl, or that fatty-acid chains from the outer leaflet may bend over to fill it. In either case, insertion will produce a distortion in the bilayer, probably leading to positive curvature. Distortion of this type could be important for the fusion process (see below).

The sE Trimer Represents a Post-fusion Conformation

The folding back of domain III and the rearrangement of β-strands at the trimer interface project the C-terminus of sE toward the fusion loop, and the most likely model (FIG. 8B, discussed above) has the 53-residue C-terminal stem running along the channel between domains II of adjacent monomers. The proposed stem conformation places the viral transmembrane domain in the immediate vicinity of the fusion loop, just as in the post-fusion conformations of class I viral fusion proteins, such as those of influenza virus and HIV. We therefore believe that the trimer we have crystallized represents a post-fusion state of the protein.

Mechanism of Membrane Fusion

The structure of the sE trimer described here suggests how conformational changes in the flavivirus E protein can promote membrane fusion (FIG. 9).

(1) E associates with a cell-surface receptor, probably through domain III (FIG. 9A), but there is evidence for glycan-mediated interactions as well. Receptor binding leads to endosomal uptake. Domain III may serve as a druggable region.

(2) Reduced pH in the endosome causes the E dimers on the virion surface to dissociate, exposing the fusion loops and allowing domains I and II to flex relative to one another (FIG. 9B). Evidence for a pH-dependent hinge at the domain I-domain II interface includes the location of mutations that alter the pH threshold of fusion, as well as the difference in orientation between the pre- and post-fusion structures. Release of the constraints imposed by dimer contacts may also allow the stem to extend away from the membrane. Some combination of these two sources of flexibility will allow domain II to turn outward, away from the virion surface, and to insert its fusion loop into the target-cell membrane. The pH-dependent hinge may serve as a druggable region.

(3) Outward projection of domain II will destroy tight packing interactions on the virion outer surface, allowing lateral rearrangement of E monomers. Thus, the absence of trimer clustering in the virion is not, in principle, a barrier to trimer formation. Trimerization through domain II might occur before or after interaction of the fusion loop with the target cell. Liposome binding is necessary for sE to trimerize, but it is not essential for trimerization of longer E polypeptides. Because at this stage in fusion the stem of E is probably not free to participate in trimer formation, and because domain III may still be constrained by receptor contacts, we believe that target membranes are probably required to catalyze trimerization. Whatever the precise order of events, we propose that the combination of fusion-loop insertion and trimer interactions among domains II leads to a pre-fusion intermediate, in which the trimer bridges host-cell and viral membranes, with its fusion loops bound to the former and its transmembrane tail anchored in the latter (FIG. 9C). This species is analogous to the “pre-hairpin” intermediate postulated for class I viral fusion mechanisms. The regions of domain II involved in trimerization may present a druggable region.

(4) Formation of trimer contacts spreads from the fusion loops at the trimer tip to domain I at the base. Domain III shifts and rotates, folding the C-terminal part of E back toward the fusion loop (FIG. 9D). The length of the interdomain linker permits independent rotation of individual domains II, allowing for the spontaneous symmetry-breaking required at this point. Cooperativity and irreversibility occur only when the exchange of β-strands shown in FIG. 7D locks in the final domain-I trimer interaction and the final folded-back position of domain III. Free energy released by this refolding can drive the two membranes to bend toward each other. A ring of trimers is presumably needed to deform the membrane properly. We cannot yet specify the number of trimers in such a ring nor how their conformational changes are coupled. It is possible, however, that coupling is provided simply by the resistance of the membranes to deformation: only when several trimers act in concert can folding back reach the barrier of β-strand exchange.

(5) In the final state, the trimer has reached the conformation seen in our crystal structure, with the stems (not present in our current crystals) docked along the surface of domains II and with the fusion loops and transmembrane anchors now next to each other in the fused membrane (FIG. 9E). The stem-domain II contact regions may present druggable regions.

When membranes fuse, the two lipid bilayers (the “substrates” of the fusion reaction) must undergo a sequence of deformations. Formation of a “hemifusion stalk”, with proximal leaflets fused and distal leaflets unfused, is thought to be an essential intermediate, followed by a transition to a lipidic fusion pore when distal leaflets merge. Specific models for the stalk differ substantially. Where along the pathway of protein rearrangement just described does a hemifusion stalk form, and what stimulates its transformation to a pore? We offer the following suggestions. (1) To initiate fusion, portions of each bilayer must approach each other to within a distance of 10 Å. The two membranes may form apposing “domes” or “nipples” to allow room for the fusion proteins, as illustrated in FIG. 9D. Positive bilayer curvature induced by fusion-loop insertion might stabilize the lateral surfaces of such a protrusion. (2) Hemifusion could occur at any point during the process represented by FIG. 9D, depending on the length of the hemifusion stalk. It seems to us most likely that hemifusion would happen during or following the β-strand exchange step that locks domains III into their trimer positions. It must, of course, precede final zippering up of the stems, as full pore formation must occur before the transmembrane segments can reach their likely final positions around the periphery of the fusion loops. (3) Hemifusion stalks can “flicker” open into narrow fusion pores. Migration of the transmembrane segments along a transient pore will prevent its closing. Thus, if the transmembrane segments (or the stretch of polypeptide chain leading into them) “snap” into place around the tips of domains II, formation of the symmetrical final structure (FIG. 9E) could drive the transition from stalk to pore.

Comparison with Class I Fusion

Despite their very different molecular architectures, the class I and class II viral fusion proteins clearly have some common mechanistic features (FIG. 9). The most striking of these is a folding back of the protein during the fusion transitions, so that its two membrane attachment points come together in the post-fusion structure. Class I proteins fold back by zippering up an “outer layer” (at least partly α-helical) around a central, trimeric coiled-coil (reviewed in ref. 1). Our structure of trimeric dengue sE shows that class II proteins do so by nucleating trimer formation around an elongated, finger-like fusion domain, by rearranging two other domains, and (probably) by zippering an extended C-terminal stem along the trimer surface.

Class II viral fusion proteins form trimers from monomers (dissociated homodimers in the case of flaviviruses; dissociated heterodimers in the case of alphaviruses (ref. 35)), while class I proteins are trimeric in their pre-fusion state. But comparison of the pre- and post-fusion states of influenza haemagglutinin (the only previous case where both structures are known for the same protein) shows that most of the trimer contacts in the latter state are not present in the former. That is, just as in the trimerization of dengue E, the important trimer interactions in the final state form during the transition. These contacts are, of course, close to the threefold axis, and they must be present before zippering up of an outer layer can occur. Indeed, the postulated pre-fusion intermediate is, both for class I fusion proteins and now for class II, a structure in which these central trimer contacts have formed but the zippering up of the outer layer has not yet begun (FIG. 9).

Is our structure for the membrane-inserted state of the flavivirus fusion loops relevant also for class I fusion-peptide insertion? An NMR structure of an isolated, 20-residue influenza virus A fusion peptide associated with a detergent micelle suggests a slightly kinked α-helix, with its N- and C-termini embedded in the outer leaflet and the kink (at about residue 10) on the surface. Unlike the flavivirus and alphavirus fusion loops, however, the class I fusion peptides have no particular sequence conservation. Indeed, the Ebola virus fusion peptide begins at the 23rd residue of GP2, rather than at the N-terminus (ref. 37), and a cysteine preceding the fusion peptide probably makes a disulfide bond with a cysteine C-terminal to it. Thus, whatever its conformation, this peptide must enter and leave the membrane from the same (external) face. Available data for class I fusion peptides are thus consistent with one important feature of our structure and of the SFV E1 post-fusion structure: insertion only into the outer bilayer leaflet.

Insertion only into the outer leaflet is also consistent with the requirement of a complete C-terminal transmembrane anchor on influenza HA or Simian virus 5 for full fusion to take place. As illustrated by FIG. 9E, one of the two membrane attachment structures must span a bilayer to stabilize a fusion pore. This appears to be the C-terminal anchor for class I fusion, as well as for class II.

Example 3 Inhibitors of Flavivirus Entry

The discovery of a hydrophobic ligand-binding pocket beneath the k1 loop in the pre-fusion structure of dengue sE has suggested one possible strategy for inhibiting flavivirus entry by interfering with the fusion transition. The rationale for that proposal is enhanced by our new structure, which shows that significant rearrangements do occur around the k1 loop during the conformational change. The trimer structure also suggests a second strategy for interfering with fusion, related to an approach successful in developing an HIV antiviral compound. Peptides corresponding to the C-terminal region of the gp41 ectodomain inhibit HIV-1 entry, probably by binding to the trimeric, N-terminal “inner core” of the protein and interfering with the folding back of the C-terminal “outer layer” against it. An analogous strategy may be possible with some class II viral fusion proteins, such as those of dengue and hepatitis C. The way in which the stem is likely to fold back suggests that peptides derived from stem sequences could block completion of the conformational change, by interacting with the relevant surfaces on the clustered domains II. This approach would interfere with the second stage of the conformational change, while targeting the pocket beneath the k1 loop would probably interfere with the first stage. The two would thus be usefully complementary.

A. Stem Peptide Inhibitor Corresponding to Residues 396-429 of the Stem Region of E.

A peptide corresponding to residues 396-429 (in the “stem” region) of dengue envelope protein (E) binds with fairly high affinity and specificity to the trimeric, post-fusion form of sE, the fragment of E spanning residues 1-395, which we crystallized first in the pre-fusion form and then in the post-fusion form. We determined the dissociation constant (Kd) of the peptide from sE from fluorescence depolarization measurements using a fluorescently labeled version of the peptide. The Kd is around 6 μM. This indicates fairly strong binding. FIG. 10 depicts the fluorescence depolarization results. The Kd is approximately equal to the concentration of sE at which the depolarization (in mP) reaches its half-maximal value (see FIG. 10A). FIG. 10B shows that fluorescently labeled peptide can be competed off of sE with unlabeled peptide. This is important because it demonstrates that the binding is specific to one site of sE, and is not due to several different weak non-specific sites. The sequence of the peptide (residues 396-429) is:

SEQ ID NO: 3: SSIGQMFETTMRGAKRMAILGDTAWDFGSLGGVF.
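
The relationship stated above, that the Kd is approximately the sE concentration at which the depolarization signal is half-maximal, follows from a simple one-site binding model. The sketch below illustrates that relationship only; it is not the fitting procedure used to produce FIG. 10. It assumes sE is in large excess over the labeled peptide, and the baseline and plateau values (50 mP and 250 mP) and the function names are hypothetical.

```python
def fraction_bound(se_conc_um: float, kd_um: float) -> float:
    """Fraction of labeled peptide bound in a one-site model.

    Assumes sE is in large excess over labeled peptide, so free sE is
    approximately equal to total sE.
    """
    if se_conc_um < 0 or kd_um <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return se_conc_um / (kd_um + se_conc_um)


def polarization_mp(se_conc_um: float, kd_um: float,
                    mp_free: float, mp_bound: float) -> float:
    """Observed polarization (mP) as a weighted mix of free and bound signal."""
    return mp_free + (mp_bound - mp_free) * fraction_bound(se_conc_um, kd_um)


if __name__ == "__main__":
    KD_UM = 6.0        # approximate Kd reported for the 396-429 peptide
    MP_FREE = 50.0     # hypothetical baseline for free labeled peptide
    MP_BOUND = 250.0   # hypothetical plateau for fully bound peptide
    for conc in (0.0, 1.0, 3.0, 6.0, 12.0, 60.0):
        signal = polarization_mp(conc, KD_UM, MP_FREE, MP_BOUND)
        print(f"[sE] = {conc:5.1f} uM -> {signal:6.1f} mP")
```

In this model the signal at [sE] = Kd lies exactly midway between the baseline and the plateau, which is why the half-maximal point of a titration such as FIG. 10A can be read as an estimate of Kd.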

B. Stem Peptide Inhibitor Corresponding to Residues 413-447 of the Stem Region of E.

Another stem region-derived polypeptide binds trimeric (postfusion) sE with slightly higher affinity than the polypeptide comprising residues 396-429. This stem peptide includes residues 413-447 of E (whereas the entire stem spans 396-447). The protein sequence for the new stem peptide (413-447) used in the attached graph is:

SEQ ID NO: 4: AILGDTAWDFGSLGGVFTSIGKALHQVFGAIYGAA

The Kd is approximately 4 μM (FIG. 11), compared to 6 μM for 396-429 (FIG. 10A). This polypeptide, as well as the polypeptide comprising residues 396-429, may be a good starting point for peptide or peptidomimetic drugs.
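
The two stem peptides are described by residue ranges (396-429 and 413-447), so they overlap within the stem. The following sketch, offered only as a bookkeeping aid, maps each sequence onto E-protein numbering and extracts the shared segment. The sequences and start residues are taken from the text; the function and variable names are hypothetical.

```python
# Start residue (E-protein numbering) and sequence of each stem peptide,
# as given in the specification.
SEQ_ID_NO_3 = (396, "SSIGQMFETTMRGAKRMAILGDTAWDFGSLGGVF")
SEQ_ID_NO_4 = (413, "AILGDTAWDFGSLGGVFTSIGKALHQVFGAIYGAA")


def residue_range(peptide):
    """Return (first, last) residue numbers covered by a peptide."""
    start, seq = peptide
    return start, start + len(seq) - 1


def shared_segment(pep_a, pep_b):
    """Return (first, last, sequence) of the overlap, or None if disjoint."""
    (start_a, seq_a), (start_b, seq_b) = pep_a, pep_b
    first = max(start_a, start_b)
    last = min(start_a + len(seq_a) - 1, start_b + len(seq_b) - 1)
    if first > last:
        return None
    seg_a = seq_a[first - start_a:last - start_a + 1]
    seg_b = seq_b[first - start_b:last - start_b + 1]
    if seg_a != seg_b:
        # The two peptides disagree where their numbering says they overlap.
        raise ValueError("overlapping residues do not match")
    return first, last, seg_a


if __name__ == "__main__":
    print(residue_range(SEQ_ID_NO_3))
    print(residue_range(SEQ_ID_NO_4))
    print(shared_segment(SEQ_ID_NO_3, SEQ_ID_NO_4))
```

A check of this kind confirms that the stated residue ranges agree with the sequence lengths and that both peptides carry the same residues over the region they share.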

Accordingly, the present invention is directed toward inhibitors comprised of SEQ ID NO: 3 and SEQ ID NO: 4, as well as fragments, homologs, variants, orthologs, and peptidomimetics thereof. Such inhibitors may have at least about 80%, at least about 85%, at least about 90%, at least about 95%, about 96%, about 97%, about 98% or about 99% homology with either SEQ ID NO: 3 or SEQ ID NO: 4.
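
As a rough illustration of how a candidate sequence might be compared against the thresholds listed above, the sketch below computes percent identity over two pre-aligned sequences of equal length. This is a simplifying assumption: the specification speaks of homology and does not prescribe an alignment method or scoring scheme, and a real comparison would normally use an alignment tool that handles gaps. The function names and the example variant are hypothetical.

```python
SEQ_ID_NO_3 = "SSIGQMFETTMRGAKRMAILGDTAWDFGSLGGVF"


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of positions with identical residues in two aligned sequences.

    Assumes both sequences are already aligned and of equal length, with no
    gaps. This is a simplification, not a general homology measure.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


def meets_threshold(candidate: str, reference: str, threshold: float) -> bool:
    """True if the candidate reaches the given percent-identity threshold."""
    return percent_identity(candidate, reference) >= threshold


if __name__ == "__main__":
    # Hypothetical variant: first residue of SEQ ID NO: 3 changed from S to A.
    variant = "A" + SEQ_ID_NO_3[1:]
    print(f"{percent_identity(variant, SEQ_ID_NO_3):.1f}% identity")
    for threshold in (80, 85, 90, 95, 99):
        print(threshold, meets_threshold(variant, SEQ_ID_NO_3, threshold))
```

With a 34-residue reference, a single substitution leaves 33 of 34 positions identical, so such a variant falls between the 95% and 99% levels named in the text.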

These polypeptides bind in a channel at the trimer interface formed by domain II of each subunit in the trimer. Domain II consists of residues 52-132 and 193-280. Hence, the channel formed at the trimer interface comprises a druggable region of the invention. Further, the present invention is directed towards inhibitors that interact with the relevant surfaces on the channel, so that completion of the conformational change is inhibited and thereby the activity of the dengue virus E protein or other E protein is inhibited.

The present invention is also directed towards an inhibitor that interacts with the pocket beneath the k1 loop to interfere with the first stage of the conformational change, thereby modulating the activity of the dengue virus E protein or other E protein. Such inhibitors may be used in complementary approaches to treat dengue viral or other viral infections.

Example 4 Druggable Regions

Based in part on the structural and inhibitor data described above in Examples 1-3, in one aspect, the present invention is directed towards druggable regions of a dengue virus E protein or other flavivirus E protein comprising the majority of the amino acid residues contained in a subject druggable region. Such druggable regions may be utilized in the structure determination, drug screening, drug design, and other methods described and claimed herein. In another aspect, the present invention is directed toward a modulator that interacts with such druggable regions. In still another aspect, the present invention is directed toward a modulator that is a fragment of (or homolog of such fragment or mimetic of such fragment) the druggable region of a dengue virus E protein or other viral class II E protein and competes with that druggable region.

In one embodiment, the druggable region is comprised of the k1 hairpin or a portion thereof. In certain embodiments, the k1 hairpin may be comprised of at least one of residues 268-280 of a dengue virus E protein or the homologous residues in another class II E protein. In other embodiments, the druggable region or active site region may be comprised of the k1 hairpin and at least one of residues 47-54, 128-137, and 187-207. In another aspect, the present invention is directed towards a modulator that interacts with the k1 hairpin so as to preclude it from moving, thereby modulating the activity of the dengue virus E protein or other flavivirus E protein.

In yet another embodiment, the druggable region may comprise the regions involved in the binding of residues 396-429 (the “stem” region of dengue envelope protein E) to the trimeric, post-fusion form of dengue virus E protein or other flavivirus E protein. In one embodiment, the druggable region is comprised of the stem region or a portion thereof. The stem region comprises residues 396-447, or fragments thereof, for example 396-429 and 413-447. In another embodiment, the druggable region is comprised of the channel in which the stem region binds. The channel is comprised of the residues at the trimer interface formed by domain II of each subunit in the trimer. Domain II consists of residues 52-132 and 193-280. A second region is the channel where the stem binds, formed by residues in domain II. In another aspect, the present invention is directed towards a modulator that interacts with the stem region or the channel so as to preclude them from interacting, thereby modulating the activity of the dengue virus E protein or other flavivirus E protein. Such modulators may be, as described above, derived from either the stem region or the channel, and compete with the stem region or channel for binding.

In another embodiment, the druggable region is comprised of the domain I-III region. In certain embodiments, the domain I-III region may be comprised of at least one of residues 38-40; 143-147; 294-296; and 354-365 of a dengue virus E protein or the homologous residues in another class II E protein. In another aspect, the present invention is directed towards a modulator that interacts with the domain I-III region so as to preclude it from moving, thereby modulating the activity of the dengue virus E protein or other E protein. In other embodiments, the druggable region may be comprised of the domain I-domain III linker (residues 294-301).

In yet another embodiment, a druggable region is comprised of the fusion loop or a portion thereof. In another aspect, the present invention is directed towards a modulator that interacts with the fusion loop so as to preclude it from moving, thereby modulating the activity of the dengue virus E protein or other E protein.

Other regions of the protein may in certain embodiments comprise a druggable region. For example, the hydrophobic core beneath the k1 hairpin or a portion thereof may comprise a druggable region. In another example, a druggable region may comprise domain II or a portion thereof. In still another example, a druggable region may comprise domain III or a portion thereof. In other examples, the pH-dependent hinge may serve as a druggable region. Further, a region or portion of a region of the E protein involved in trimerization, such as, for example, the regions of domain II involved in trimerization, may present a druggable region. A region or a portion of a region involved in the stem fold-back conformational change may comprise a druggable region, for example, such regions as the stem-domain II contact regions, the trimeric N-terminal inner core, and C-terminal outer layer surfaces on the clustered domains II, as well as the 53-residue stem. In certain embodiments, a druggable region may consist of the entire fragment of the E protein spanning residues 1-395.
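
The residue ranges recited in this Example can be collected into a simple lookup, for instance to annotate which named regions a given residue of dengue E falls in. The sketch below is an organizational aid only, not part of the claimed methods. The ranges are those stated in the text; the region labels and function names are hypothetical, and regions for which the text gives no residue numbers (such as the fusion loop as a whole) are omitted.

```python
# Residue ranges (inclusive, dengue E numbering) as recited in the text.
DRUGGABLE_REGIONS = {
    "k1 hairpin": [(268, 280)],
    "k1 hairpin-associated residues": [(47, 54), (128, 137), (187, 207)],
    "domain I-III region": [(38, 40), (143, 147), (294, 296), (354, 365)],
    "domain I-domain III linker": [(294, 301)],
    "domain II": [(52, 132), (193, 280)],
    "stem": [(396, 447)],
    "sE fragment": [(1, 395)],
}


def regions_containing(residue: int) -> list:
    """Return the names of all listed regions that include the residue."""
    return [
        name
        for name, ranges in DRUGGABLE_REGIONS.items()
        if any(first <= residue <= last for first, last in ranges)
    ]


def region_size(name: str) -> int:
    """Return the number of residues covered by a named region."""
    return sum(last - first + 1 for first, last in DRUGGABLE_REGIONS[name])


if __name__ == "__main__":
    for residue in (101, 275, 295, 400):
        print(residue, regions_containing(residue))
    print("domain II residues:", region_size("domain II"))
```

Because several of the recited regions overlap (for example, the k1 hairpin lies within domain II), a residue can belong to more than one region, which the lookup returns as a list.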

Modulators of any of the above-described druggable regions may be used alone or in complementary approaches to treat dengue viral or other viral infections.

Equivalents

The present invention provides in part methods of screening novel druggable regions in dengue virus envelope protein to develop modulators of the protein. While specific embodiments of the subject invention have been discussed, the above specification is illustrative and not restrictive. Many variations of the invention will become apparent to those skilled in the art upon review of this specification. The appended claims are not intended to claim all such embodiments and variations, and the full scope of the invention should be determined by reference to the claims, along with their full scope of equivalents, and the specification, along with such variations.

All publications and patents mentioned herein, including those items listed below, are hereby incorporated by reference in their entireties as if each individual publication or patent was specifically and individually indicated to be incorporated by reference. In case of conflict, the present application, including any definitions herein, will control.

-   Allison, S. L., et al. (2001) J. Virol., 75, 4268-4275; Allison, S.    L., et al. (1995) J. Virol., 69, 695-700; Brunger, A. T., et    al. (1998) Acta Crystallogr. D., 54, 905-921; Burke, D. S. and    Monath, T. P. (2001) Fields Virology, Lippincott Williams & Wilkins,    Philadelphia, 1043-1125; Collaborative Computational    Project, N. (1994) Acta Crystallogr. D., 50, 760-763; Crill, W. D.    and Roehrig, J. T. (2001) J. Virol., 75, 7769-7773; Cuzzubbo, A. J.,    et al. (2001) Clin. Diagn. Lab. Immunol., 8, 1150-1155;    Esnouf, R. M. (1997) J. Mol. Graphics, 15, 132-134; Ferlenghi, I.,    et al. (2001) Mol. Cell, 7, 593-602; Gubler, D. J. (2002) Trends    Microbiol., 10, 100-103; Guirakhoo, F., et al. (1993) Virology, 194,    219-223; Hahn, Y. S., et al. (1988) Virology, 162, 167-180;    Heinz, F. X. and Allison, S. L. (2001) Curr. Opin. Microbiol., 4,    450-455; Hung, S. L., et al. (1999) Virology, 257, 156-167; Ivy, J.,    et al. (1997) United States Patent and Trademark Office, Hawaii    Biotechnology Group, Inc., USA; Jones, T. A., et al. (1991) Acta    Crystallogr. A., 47, 110-119; Kawano, H., et al. (1993) J. Virol.,    67, 6567-6575; Kraulis, P. J. (1991) J. Appl. Crystallogr., 24,    946-950; Kuhn, R. J., et al. (2002) Cell, 108, 717-725; La    Fortelle, E. and Bricogne, G. (1997) Methods in Enzymology, 276,    472-494; Laskowski, R. A., et al. (1993) J. Appl. Cryst., 26,    283-291; Lee, E., et al. (1997) Virology, 232, 281-290; Lescar, J.,    et al. (2001) Cell, 105, 137-148; Lindenbach, B. D. and    Rice, C. M. (2001) Fields Virology, Lippincott Williams and Wilkins,    Philadelphia, 991-1041; Merritt, E. A. and Bacon, D. J. (1997),    Methods in Enzymology, 277, 505-524; Monath, T. P., et al. (2002) J.    Virol., 76, 1932-1943; Navaza J. (2001) Acta Crystallogr. D., 57,    1367-1372; Otwinowski, Z and Minor, W. (1997) Methods in Enzymology,    276, 307-326; Pletnev, A. G., et al. (1993) J. Virol., 67,    4956-4963; Rey, F. A., et al. 
(1995) Nature, 375, 291-298;    Skehel, J. J. and Wiley, D. C. (2000) Annu. Rev. Biochem., 69,    531-569; Smith, T. J. et al. (1986) Science, 233, 1286-1293;    Terwilliger, T. C. (1999) Acta Crystallogr. D., 55, 1863-1871;    Terwilliger, T. C. and Berendzen J. (1999) Acta Crystallogr. D., 55,    849-861; Weissenhorn, W., et al. (1999) Mol. Membr. Biol., 16, 3-9-   Skehel, J. J. & Wiley, D. C. (2000) Annu. Rev. Biochem. 69, 531-569;    Wilson, I. A., Skehel, J. J. & Wiley, D. C. (1981) Nature 289,    366-373; Bullough, P. A., Hughson, F. M., Skehel, J. J. &    Wiley, D. C. (1994) Nature 371, 37-43; Chen, J., Skehel, J. J. &    Wiley, D. C. (1999) Proc. Natl. Acad. Sci. U.S.A. 96, 8967-8972;    Rey, F. A., Heinz, F. X., Mandl, C., Kunz, C. &    Harrison, S. C. (1995) Nature 375, 291-298; Lescar, J. et al. (2001)    Cell 105, 137-148; Modis, Y. & Harrison, S. C. (2003) Proc. Natl.    Acad. Sci. U.S.A. 100, 6986-6991; Allison, S. L. et al. (1995) J.    Virol. 69, 695-700; Ferlenghi, I. et al. (2001) Mol. Cell 7,    593-602; Kuhn, R. J. et al. (2002) Cell 108, 717-725; Allison, S.    L., Schalich, J., Stiasny, K., Mandl, C. W. & Heinz, F. X. (2001) J.    Virol. 75, 4268-4275; Levy-Mintz, P. & Kielian, M. (1991) J. Virol.    65, 4292-4300; Ahn, A., Gibbons, D. L. & Kielian, M. (2002) J.    Virol. 76, 3267-3275; Stiasny, K., Allison, S. L., Schalich, J. &    Heinz, F. X. (2002) J. Virol. 76, 3784-3790; Allison, S. L.,    Stiasny, K., Stadler, K., Mandl, C. W. & Heinz, F. X. (1999) J.    Virol. 73, 5605-5612; Zhang, W. et al. (2003) Nat. Struct. Biol. in    press; Carrell, R. W., Stein, P. E., Fermi, G. & Wardell, M. R.    (1994) Structure 2, 257-270; Wimley, W. C. & White, S. H. (1992)    Biochemistry 31, 12813-12818; Crill, W. D. &    Roehrig, J. T. (2001) J. Virol. 75, 7769-7773; Jennings, A. D. et    al. (1994) J. Infect. Dis. 169, 512-518; Lobigs, M. et al. (1990)    Virology 176, 587-595; Holzmann, H., Heinz, F. X., Mandl, C. W.,    Guirakhoo, F. & Kunz, C. 
(1990) J. Virol. 64, 5156-5159; Jiang, W.    R., Lowe, A., Higgs, S., Reid, H. & Gould, E. (1993) J. Gen. Virol.    74, 931-935; Gao, G. F., Hussain, M. H., Reid, H. W. &    Gould, E. A. (1994) J. Gen. Virol. 75, 609-614; Cecilia, D. &    Gould, E. A. (1991) Virology 181, 70-77; Chen, Y. et al. (1997) Nat.    Med. 3, 866-871; Navarro-Sanchez, E. et al. (2003) EMBO Rep. 4, 1-6;    Tassaneetrithep, B. et al. (2003) J. Exp. Med. 197, 823-829;    Stiasny, K., Allison, S. L., Marchler-Bauer, A., Kunz, C. &    Heinz, F. X. (1996) J. Virol. 70, 8142-8147; Chan, D. C. &    Kim, P. S. (1998) Cell 93, 681-684; Kuzmin, P. I., Zimmerberg, J.,    Chizmadzhev, Y. A. & Cohen, F. S. (2001) Proc. Natl. Acad. Sci.    U.S.A. 98, 7235-7240; Kozlov, M. M. & Chernomordik, L. V. (1998)    Biophys. J. 75, 1384-1396; Razinkov, V. I., Melikyan, G. B. &    Cohen, F. S. (1999) Biophys. J. 77, 3144-3151 Rand, R. P. &    Parsegian, V. A. (1986) Annu. Rev. Physiol. 48, 201-212;    Wahlberg, J. M., Bron, R., Wilschut, J. & Garoff, H. (1992) J.    Virol. 66, 7309-7318; Han, X., Bushweller, J. H., Cafiso, D. S. &    Tamm, L. K. (2001) Nat. Struct. Biol. 8, 715-720; Ito, H., Watanabe,    S., Sanchez, A., Whitt, M. A. & Kawaoka, Y. (1999) J. Virol. 73,    8907-8912; Kemble, G. W., Danieli, T. & White, J. M. (1994) Cell 76,    383-391; Armstrong, R. T., Kushinir, A. S. & White, J. M. (2000) J    Cell Biol 151, 425-437; Dutch, R. E. & Lamb, R. A. (2001) J Virol    75, 5363-5369; Baldwin, C. E., Sanders, R. W. & Berkhout, B. (2003)    Curr. Med. Chem. 10, 1633-1642; Kilby, J. M. et al. (1998) Nat. Med.    4, 1302-1307; Hahn, Y. S. et al. (1988) Virology 162, 167-180;    Schneider, I. (1972) J Embryol Exp Morphol 27, 353-365; Ivy, J.,    Nakano, E. & Clements, D. in United States Patent and Trademark    Office (Hawaii Biotechnology Group, Inc., U.S.A., 1997);    Cuzzubbo, A. J. et al. (2001) Clin. Diagn. Lab. Immunol. 8,    1150-1155; Otwinowski, Z. & Minor, W. 
(1997) Methods in Enzymology    276, 307-326; Navaza, J. (2001) Acta Crystallogr. D 57, 1367-1372;    Collaborative Computational Project, N. The CCP4 suite: programs for    protein crystallography. Acta Crystallogr. D 50, 760-763 (1994);    Brünger, A. T. et al. (1998) Acta Crystallogr. D 54, 905-921;    Jones, T. A., Zou, J. Y., Cowan, S. W. & Kjeldgaard. (1991) Acta    Crystallogr. A 47, 110-119; Laskowski, R. A., MacArthur, M. W.,    Moss, D. S. & Thornton, J. M. PROCHECK: (1993) J. Appl. Cryst. 26,    283-291.

1. A method for identifying a candidate therapeutic for a disease causedby a virus having class II E protein, comprising contacting a denguevirus class II E protein with a compound, wherein binding of saidcompound indicates a candidate therapeutic, wherein the dengue virusclass II E protein consists of an amino acid sequence at least 20 aminoacids but not greater than 394 amino acids in length having at least 85%identity along the length to the amino acid sequence of SEQ ID NO: 1 or2, wherein amino acid residue 101 is Trp, amino acid residue 107 is Leuand amino acid residue 108 is Phe when numbered in accordance with SEQID NO: 1 or
 2. 2. The method of claim 1, wherein said compound isselected from the following classes of compounds: polypeptides,peptidomimetics, and small molecules.
3. The method of claim 1, wherein said disease is selected from the following group: dengue fever, dengue hemorrhagic fever, tick-borne encephalitis, West Nile virus disease, yellow fever, Kyasanur Forest disease, louping ill, hepatitis C, Ross River virus disease, and O'nyong fever.
4. The method of claim 1, wherein said compound is in a library of compounds.
5. The method of claim 1, wherein said library is generated using combinatorial synthetic methods.
6. The method of claim 1, wherein binding is determined using an in vitro assay.
7. The method of claim 1, wherein binding is determined using an in vivo assay.
8. A method for identifying a candidate therapeutic for a disease caused by a virus having class II E protein, comprising contacting a dengue virus class II E protein with a compound, wherein the modulation of the activity of said E protein indicates a candidate therapeutic, wherein the dengue virus class II E protein consists of an amino acid sequence at least 20 amino acids but not greater than 394 amino acids in length having at least 85% identity along the length to the amino acid sequence of SEQ ID NO: 1 or 2, wherein amino acid residue 101 is Trp, amino acid residue 107 is Leu and amino acid residue 108 is Phe when numbered in accordance with SEQ ID NO: 1 or 2.
9. The method of claim 8, wherein said modulation of the activity of said E protein involves precluding the movement of the k1 hairpin.
10. The method of claim 8, wherein said modulation of the function or activity of said E protein involves precluding the movement of the domain I-III region.
11. The method of claim 8, wherein said modulation of the function or activity of said E protein involves precluding completion of the post-fusion conformational changes by interacting with the domain II residues at the trimer interface formed by domain II of each subunit in the postfusion trimer.
12. The method of claim 8, wherein said modulation of the function or activity of said E protein involves precluding interaction of the k1 hairpin with the pocket beneath the k1 hairpin.
13. A method for identifying a candidate therapeutic for a disease caused by a virus having class II E protein, comprising contacting a dengue virus class II E protein with a compound, wherein the inhibition of fusion in said virus indicates a candidate therapeutic, wherein the dengue virus class II E protein consists of an amino acid sequence at least 20 amino acids but not greater than 394 amino acids in length having at least 85% identity along the length to the amino acid sequence of SEQ ID NO: 1 or 2, wherein amino acid residue 101 is Trp, amino acid residue 107 is Leu and amino acid residue 108 is Phe when numbered in accordance with SEQ ID NO: 1 or 2.
14. A method for identifying a candidate therapeutic for a disease caused by a virus having class II E protein, comprising contacting a dengue virus class II E protein with a compound, wherein binding of said compound indicates a candidate therapeutic, wherein the dengue virus class II E protein consists of an amino acid sequence at least 20 amino acids but not greater than 394 amino acids in length having at least 85% identity along the length to the amino acid sequence of SEQ ID NO: 1 or 2, comprising residues 268-280 and at least one of residues 47-54, 128-137, and 187-207 when numbered in accordance with SEQ ID NO: 1 or 2.
15. The method of claim 1, wherein the dengue virus class II E protein consists of an amino acid sequence at least 20 amino acids but not greater than 394 amino acids in length having the amino acid sequence of SEQ ID NO: 1 or 2.
16. The method of claim 8, wherein the dengue virus class II E protein consists of an amino acid sequence at least 20 amino acids but not greater than 394 amino acids in length having the amino acid sequence of SEQ ID NO: 1 or 2.
17. The method of claim 13, wherein the dengue virus class II E protein consists of an amino acid sequence at least 20 amino acids but not greater than 394 amino acids in length having the amino acid sequence of SEQ ID NO: 1 or 2.
18. The method of claim 14, wherein the dengue virus class II E protein consists of an amino acid sequence at least 20 amino acids but not greater than 394 amino acids in length having the amino acid sequence of SEQ ID NO: 1 or 2.
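
By way of illustration only, the sequence limitations recited in claims 1, 8 and 13 (a length of 20 to 394 amino acids, at least 85% identity to SEQ ID NO: 1 or 2, and Trp, Leu and Phe at residues 101, 107 and 108) may be checked with a short filter such as the Python sketch below. The sketch forms no part of the claimed methods and rests on several assumptions: the candidate and reference sequences are supplied already aligned as equal-length strings with "-" marking gaps, identity is computed over the aligned columns occupied by the candidate, and the function names are hypothetical. The alignment procedure itself is not shown, and the reference used in the usage note is a placeholder rather than SEQ ID NO: 1 or 2.

```python
# Illustrative sketch only; function names and the identity convention are assumptions.

# Residues required at reference positions (1-based, numbered per the reference).
REQUIRED_RESIDUES = {101: "W", 107: "L", 108: "F"}  # Trp, Leu, Phe


def reference_column(aligned_ref: str, position: int) -> int:
    """Return the alignment column that holds 1-based reference residue `position`."""
    seen = 0
    for col, aa in enumerate(aligned_ref):
        if aa != "-":
            seen += 1
            if seen == position:
                return col
    raise ValueError(f"reference has fewer than {position} residues")


def percent_identity(aligned_query: str, aligned_ref: str) -> float:
    """Percent identity over the columns in which the query has a residue (assumed convention)."""
    columns = [(q, r) for q, r in zip(aligned_query, aligned_ref) if q != "-"]
    if not columns:
        return 0.0
    matches = sum(q == r for q, r in columns)
    return 100.0 * matches / len(columns)


def meets_sequence_criteria(aligned_query: str, aligned_ref: str,
                            min_len: int = 20, max_len: int = 394,
                            min_identity: float = 85.0) -> bool:
    """Check the length, identity and fixed-residue limitations for one aligned pair."""
    if len(aligned_query) != len(aligned_ref):
        raise ValueError("sequences must be pre-aligned to equal length")
    # Ungapped length of the candidate sequence.
    length = sum(aa != "-" for aa in aligned_query)
    if not (min_len <= length <= max_len):
        return False
    if percent_identity(aligned_query, aligned_ref) < min_identity:
        return False
    # Each required residue must be present at its reference-numbered position.
    for position, residue in REQUIRED_RESIDUES.items():
        if aligned_query[reference_column(aligned_ref, position)] != residue:
            return False
    return True
```

In use, a candidate fragment would first be aligned to the reference by any suitable alignment tool, and the two gapped strings passed to meets_sequence_criteria; a fragment that does not span residues 101, 107 and 108 is rejected because a gap occupies the required columns.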